Abstract

Randomness extractors and error correcting codes are fundamental objects in computer science.
Recently, there have been several natural generalizations of these objects, in the context
and study of tamper resilient cryptography. These are seeded non-malleable extractors,
introduced by Dodis and Wichs [DW09]; seedless non-malleable extractors, introduced by Cheraghchi
and Guruswami [CG14b]; and non-malleable codes, introduced by Dziembowski, Pietrzak and
Wichs [DPW10]. Besides being interesting on their own, they also have important applications
in cryptography. For example, seeded non-malleable extractors are closely related to privacy
amplification with an active adversary, non-malleable codes are related to non-malleable secret
sharing, and seedless non-malleable extractors provide a universal way to construct explicit
non-malleable codes.

However, explicit constructions of non-malleable extractors appear to be hard, and the known
constructions are far behind their non-tampered counterparts. Indeed, the best known seeded
non-malleable extractor requires min-entropy rate at least 0.49 [Li12b], while explicit
constructions of non-malleable two-source extractors were not known even if both sources have full
min-entropy, and this was left as an open problem in [CG14b]. In addition, current constructions of
non-malleable codes in the information theoretic setting only deal with the situation where the
codeword is tampered once, and may not be enough for certain applications.

In this paper we make progress towards solving the above problems. Our contributions are 
as follows. 

• We construct an explicit seeded non-malleable extractor for min-entropy $k \ge \log^2 n$. This
dramatically improves all previous results and gives a simpler 2-round privacy amplification
protocol with optimal entropy loss, matching the best known result in [Li15a].

• We construct the first explicit non-malleable two-source extractor for min-entropy
$k \ge n - n^{\Omega(1)}$, with output size $n^{\Omega(1)}$ and error $2^{-n^{\Omega(1)}}$.

• We motivate and initiate the study of two natural generalizations of seedless non-malleable
extractors and non-malleable codes, where the sources or the codeword may be tampered
many times. For this, we construct the first explicit non-malleable two-source extractor
with tampering degree $t$ up to $n^{\Omega(1)}$, which works for min-entropy
$k \ge n - n^{\Omega(1)}$, with output size $n^{\Omega(1)}$ and error $2^{-n^{\Omega(1)}}$.
We further show that we can efficiently sample uniformly from any pre-image. By the
connection in [CG14b], we also obtain the first explicit non-malleable codes with tampering
degree $t$ up to $n^{\Omega(1)}$, relative rate $n^{\Omega(1)}/n$, and error $2^{-n^{\Omega(1)}}$.
1 Introduction


Randomness extractors are fundamental objects in the study of randomness in computation. They
are efficient algorithms that transform imperfect randomness into almost uniform random bits.
Here we use the standard model of weak random source to model imperfect randomness. The
min-entropy of a random variable $X$ is defined as
$H_\infty(X) = \min_{x \in \mathsf{Supp}(X)} \log_2(1/\Pr[X = x])$. For a source $X$ supported on
$\{0,1\}^n$, we call $X$ an $(n, H_\infty(X))$-source, and we say $X$ has entropy rate
$H_\infty(X)/n$.
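
As a small illustration of the definition above, the following Python sketch computes the min-entropy and entropy rate of an explicitly given distribution. The function names and the dictionary representation of a source are assumptions made for this example only; they do not come from the paper.

```python
import math

def min_entropy(dist):
    """Min-entropy: the minimum over the support of log2(1 / Pr[X = x]).

    `dist` maps outcomes to probabilities (an illustrative representation).
    The minimum is attained at the most likely outcome.
    """
    p_max = max(p for p in dist.values() if p > 0)
    return math.log2(1.0 / p_max)

def entropy_rate(dist, n):
    """Entropy rate of a source supported on n-bit strings."""
    return min_entropy(dist) / n

# A uniform source on 2 bits and a skewed source on 2 bits.
uniform_2bits = {"00": 0.25, "01": 0.25, "10": 0.25, "11": 0.25}
skewed_2bits = {"00": 0.5, "01": 0.25, "10": 0.25}

print(min_entropy(uniform_2bits))      # full min-entropy: 2 bits
print(entropy_rate(skewed_2bits, 2))   # the heaviest point has mass 1/2
```

The skewed source shows that min-entropy is governed only by the most likely outcome, which is why it is the relevant measure when an adversary tries to guess the source.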

As one can show that no deterministic extractor works for all weak random sources even with
min-entropy $k = n - 1$, randomness extractors are studied in two different settings. In one setting
the extractor is given a short independent uniform random seed, and these extractors are called
seeded extractors. Informally, a seeded extractor
$\mathsf{Ext}: \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ for min-entropy
$k$ and error $\varepsilon$ takes as input any $(n, k)$ source $X$ and a uniform seed $S$, and has
the property that $|\mathsf{Ext}(X, S) - U_m| \le \varepsilon$, where the distance used is the
standard statistical distance. If the output of
the extractor is guaranteed to be close to uniform even after seeing the value of the seed $S$, then
it is called a strong seeded extractor. In the other setting there is no such random seed, but the
source is assumed to have some special structure. These extractors are called seedless extractors
(see Section 3 for formal definitions). A special kind of seedless extractor that has received a
lot of attention is extractors for independent weak random sources. Here one can use the
probabilistic method to show that such extractors exist for only two independent sources (such
extractors are called two-source extractors), but the known constructions are still not optimal.
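
The impossibility claim at the start of this paragraph can be checked by brute force for small $n$. The sketch below is an illustration, not taken from the paper: it takes a one-bit deterministic function `f`, builds the flat source on the larger of its two preimages (which has min-entropy at least $n - 1$), and measures the statistical distance of the output from a uniform bit. The choice of parity as `f` and all helper names are assumptions made for the example.

```python
import math
from itertools import product

def statistical_distance(p, q):
    """Half the L1 distance between two distributions given as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

def bad_source_for(f, n):
    """Flat source with min-entropy >= n - 1 on which f is constant."""
    points = ["".join(bits) for bits in product("01", repeat=n)]
    zeros = [x for x in points if f(x) == 0]
    ones = [x for x in points if f(x) == 1]
    support = zeros if len(zeros) >= len(ones) else ones
    return {x: 1.0 / len(support) for x in support}

def output_distribution(f, source):
    """Distribution of f(X) when X is drawn from `source`."""
    out = {}
    for x, p in source.items():
        out[f(x)] = out.get(f(x), 0.0) + p
    return out

def parity(x):
    """A candidate one-bit deterministic extractor (illustrative choice)."""
    return x.count("1") % 2

n = 4
source = bad_source_for(parity, n)
out = output_distribution(parity, source)
uniform_bit = {0: 0.5, 1: 0.5}

print(math.log2(len(source)))                  # min-entropy of the flat source
print(statistical_distance(out, uniform_bit))  # distance of f(X) from uniform
```

The same argument applies to any one-bit function, since one of its two preimages always contains at least half of $\{0,1\}^n$.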

Both kinds of extractors have been studied extensively, and shown to have many connections
and applications in computer science. For example, seeded extractors can be used to simulate
randomized algorithms with access to only weak random sources, and are closely related to
pseudorandom generators, error-correcting codes and expanders. Independent source extractors can be
used to generate high quality random bits for distributed computing and cryptography [KLRZ08],
[KLR09], and are closely related to Ramsey graphs and other seedless extractors.

In cryptographic applications, however, one faces a new situation where the inputs of an
extractor may be tampered with by an adversary. For example, an adversary may tamper with the seed
of a seeded extractor, or both sources of a two-source extractor. In this case, one natural question
is how the output on the tampered inputs will depend on the output on the initial inputs. In order
to be resilient to adversarial tampering, one natural way is to require that the original output of
the extractor be (almost) independent of the tampered output. This leads to the notion of
non-malleable extractors, in both the seeded case and the seedless case. These extractors not only
are interesting in their own right, but also have important applications in cryptography.

Definition 1.1 (Tampering Function). For any function $f: S \to S$, $f$ has a fixed point at
$s \in S$ if $f(s) = s$. We say $f$ has no fixed points in $T \subseteq S$ if $f(t) \neq t$ for all
$t \in T$. $f$ has no fixed points if $f(s) \neq s$ for all $s \in S$.
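
For small domains, the fixed-point conditions in Definition 1.1 can be checked exhaustively. The following is a minimal sketch; the helper names and the two sample tampering functions are hypothetical and chosen only to illustrate the definition.

```python
def fixed_points(f, domain):
    """All points s in `domain` with f(s) == s."""
    return [s for s in domain if f(s) == s]

def has_no_fixed_points(f, domain):
    """True if f(s) != s for every s in `domain`."""
    return not fixed_points(f, domain)

# Domain S: all 3-bit strings.
seeds = [format(i, "03b") for i in range(8)]

def flip_last_bit(s):
    """Tampering function with no fixed points on S."""
    return s[:-1] + ("1" if s[-1] == "0" else "0")

def constant_zero(s):
    """Tampering function whose only fixed point is 000."""
    return "000"

print(has_no_fixed_points(flip_last_bit, seeds))      # no fixed points on S
print(fixed_points(constant_zero, seeds))             # fixed point at 000
print(has_no_fixed_points(constant_zero, seeds[1:]))  # none in T = S \ {000}
```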

Seeded non-malleable extractors were introduced by Dodis and Wichs in [DW09], as a
generalization of strong seeded extractors.

Definition 1.2 (Non-malleable extractor). A function
$\mathsf{snmExt}: \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ is
a seeded non-malleable extractor for min-entropy $k$ and error $\varepsilon$ if the following
holds: If $X$ is a source on $\{0,1\}^n$ with min-entropy $k$ and
$\mathcal{A}: \{0,1\}^d \to \{0,1\}^d$ is an arbitrary tampering function
with no fixed points, then

$$|\mathsf{snmExt}(X, U_d) \circ \mathsf{snmExt}(X, \mathcal{A}(U_d)) \circ U_d - U_m \circ \mathsf{snmExt}(X, \mathcal{A}(U_d)) \circ U_d| \le \varepsilon,$$

where $U_m$ is independent of $U_d$ and $X$.

The original motivation for seeded non-malleable extractors is to study the problem of privacy
amplification with an active adversary. This is a basic problem in information theoretic
cryptography, where two parties want to communicate with each other to convert their shared secret
weak random source $X$ into shared secret nearly uniform random bits. However, the communication
channel is watched by an adversary Eve, where we assume Eve has unlimited computational power
and the two parties have local (non-shared) uniform random bits.

In the case where Eve is passive (i.e., can only see the messages but cannot change them), this
problem can be solved by just applying a strong seeded extractor. However, in the case where Eve is
active (i.e., can arbitrarily change, delete and reorder messages), the problem becomes much more
complicated. The major goal here is to design a protocol that uses as few rounds of interaction as
possible, and outputs a uniform random string $R$ whose length is as close to $H_\infty(X)$ as
possible (the difference is called entropy loss). There has been extensive research on this problem
(we give more details in Section 1.4). Along this line, major progress was made by Dodis and Wichs
[DW09], who showed that seeded non-malleable extractors can be used to construct privacy
amplification protocols with optimal round complexity and entropy loss.

This connection makes constructing non-malleable extractors a very promising approach to
privacy amplification. However, all known constructions of such extractors ([DLWZ14], [CRS14],
[Li12a], [DY13], [Li12b]) require the entropy of the weak source to be at least 0.49n. Moreover,
all known constructions are essentially based on known two-source extractors, and the entropy
requirement is exactly the same as the best known two-source extractor [Bou05]. Thus the general
feeling is that constructing explicit seeded non-malleable extractors for smaller entropy may be
difficult, as is the situation for two-source extractors. In this work, somewhat surprisingly, we
show that this is not the case. We dramatically improve all previous results and give explicit
seeded non-malleable extractors that work for any min-entropy $k \ge \log^2 n$. Apart from
applications to cryptography, this may be of independent interest due to the connection found
between seeded non-malleable extractors and two-source extractors in [Li12b].

We now discuss the seedless variant of non-malleable extractors. Cheraghchi and Guruswami
[CG14b] introduced seedless non-malleable extractors as a natural generalization of seeded
non-malleable extractors. Furthermore, they found an elegant connection between seedless
non-malleable extractors and non-malleable codes, which are a generalization of error-correcting
codes to handle a much larger class of tampering functions (rather than just bit erasure or
modification). Informally, non-malleable codes are defined with respect to a family of tampering
functions $\mathcal{F}$, and require that the decoding of any codeword that is tampered by a
function $f \in \mathcal{F}$ is either the original message itself or
something totally independent of the message (see Section 1.1). Non-malleable codes have also
been extensively studied recently (we provide more details in Section ini), and Cheraghchi and
Guruswami [CG14b] showed a universal way of constructing explicit non-malleable codes by first
constructing non-malleable seedless extractors.

In this paper we focus on one of the most interesting and well studied families of tampering
functions, where the function tampers the original message independently in two separate parts.
This is called the 2-split-state model (see Section 1.1 for a formal discussion). The corresponding
seedless non-malleable extractor is then a generalization of two-source extractors, where both
sources can be tampered. For ease of presentation, we present a simplified definition here and we
refer the reader to Section 0] for the formal definition.

Definition 1.3 (Seedless 2-Non-Malleable Extractor). A function
$\mathsf{nmExt}: \{0,1\}^n \times \{0,1\}^n \to \{0,1\}^m$ is a seedless 2-non-malleable extractor
at min-entropy $k$ and error $\varepsilon$ if it satisfies the following
property: If $X$ and $Y$ are independent $(n, k)$-sources and $\mathcal{A} = (f, g)$ is an
arbitrary 2-split-state tampering function, such that at least one of $f$ and $g$ has no fixed
points, then

$$|\mathsf{nmExt}(X, Y) \circ \mathsf{nmExt}(\mathcal{A}(X, Y)) - U_m \circ \mathsf{nmExt}(f(X), g(Y))| \le \varepsilon,$$

where both $U_m$'s refer to the same uniform $m$-bit string.

Again, the connection in [CG14b] makes constructing seedless 2-non-malleable extractors a very
interesting and promising approach to non-malleable codes in the 2-split-state model. However,
no explicit constructions of 2-non-malleable extractors were known even when both sources are
perfectly uniform. Indeed, finding an explicit construction of such extractors was left as an open
problem in [CG14b], and none of the known constructions of seeded non-malleable extractors seem
to satisfy this stronger notion. In this paper we solve this open problem and give the first
explicit construction of 2-non-malleable extractors. Furthermore, we show that given any output of
the extractor, we can efficiently sample uniformly from its pre-image. By the connection in
[CG14b], this also gives explicit non-malleable codes in the above mentioned well studied
2-split-state model.

We note that our results about non-malleable codes in the 2-split-state model do not improve
the already nearly optimal construction in the recent work of Aggarwal et al. [ADKO15]. However,
our construction of seedless 2-non-malleable extractors is of independent interest, and provides a
more direct way to construct non-malleable codes.¹

Finally, as in the case of seeded non-malleable extractors [CRS14], we consider the situation
where the sources can be tampered many times. For this, we introduce a natural generalization
of seedless 2-non-malleable extractors which we call seedless (2, t)-non-malleable extractors (i.e.,
the sources are tampered t times). Correspondingly, in the case of non-malleable codes we also
consider the situation where a codeword can be tampered many times. For this, we also introduce
a natural generalization of non-malleable codes which we call one-many non-malleable codes (see
Section [ni). We initiate the study of these two objects in this paper and show that one-many
non-malleable codes have several natural and interesting applications in cryptography.

We present a simplified definition of seedless (2, t)-non-malleable extractors here, and refer the
reader to Section 4 for the formal definition.

Definition 1.4 (Seedless (2,t)-Non-Malleable Extractor). A function
$\mathsf{nmExt}: \{0,1\}^n \times \{0,1\}^n \to \{0,1\}^m$ is a seedless (2, t)-non-malleable
extractor at min-entropy $k$ and error $\varepsilon$ if it satisfies the
following property: If $X$ and $Y$ are independent $(n, k)$-sources and
$\mathcal{A}_1 = (f_1, g_1), \ldots, \mathcal{A}_t = (f_t, g_t)$
are $t$ arbitrary 2-split-state tampering functions, such that for each $i \in \{1, \ldots, t\}$ at
least one of $f_i$ and $g_i$ has no fixed points, then

$$|\mathsf{nmExt}(X, Y), \mathsf{nmExt}(\mathcal{A}_1(X, Y)), \ldots, \mathsf{nmExt}(\mathcal{A}_t(X, Y)) - U_m, \mathsf{nmExt}(\mathcal{A}_1(X, Y)), \ldots, \mathsf{nmExt}(\mathcal{A}_t(X, Y))| \le \varepsilon,$$

where both $U_m$'s refer to the same uniform $m$-bit string.

We provide explicit constructions of seedless (2, t)-non-malleable extractors for $t$ up to
$n^{\delta}$ for a small enough constant $\delta$. Just as with the connection between
2-non-malleable extractors and regular non-malleable codes, we show that these extractors lead to
explicit constructions of one-many non-malleable codes in the 2-split-state model. We note that, as
in the case of regular non-malleable codes, the construction based on (2, t)-non-malleable
extractors may not be the only way to construct one-many non-malleable codes. However, it appears
non-trivial to extend other existing constructions of non-malleable codes to satisfy this stronger
notion. We discuss this in more detail in Section 1.3.

¹ In [ADKO15], non-malleable codes in the 2-split-state model are constructed by giving efficient
reductions from the 2-split-state model to the t-split-state model, and then using known
constructions of NM codes in the t-split-state model with almost optimal parameters [CZ14].

We now formally define one-many non-malleable codes below. 

1.1 Non-malleable codes 

We introduce the notion of what we call one-many and many-many non-malleable codes,
generalizing the notion of non-malleable codes introduced by Dziembowski, Pietrzak and Wichs
[DPW10]. Since the introduction of non-malleable codes, there has been a flurry of recent work on
finding explicit constructions, resulting in applications to tamper-resilient cryptography
[DPW10], robust versions of secret sharing schemes [ADL14], and connections to the seemingly
unrelated area of derandomization [CG14b]. We discuss prior work in detail in Section [T31

We briefly motivate the notion of non-malleable codes. Traditional error-correcting codes encode
a message $m$ into a longer codeword $c$, enabling recovery of $m$ even after part of $c$ is
corrupted. We can view this corruption as a tampering function $f$ acting on the codeword, where
$f$ is from some small allowable family $\mathcal{F}$ of tampering functions. The strict
requirement of retrieving the encoded message $m$ imposes restrictions on the kind of tampering
functions that can be handled. One might hope to achieve a weaker goal of only detecting errors,
possibly with high probability. However, the notion of error detection fails to work with respect
to the family of constant functions, since one cannot hope to detect errors against a function
that always outputs some fixed codeword.

The notion of non-malleable codes is an elegant generalization of error-detecting codes.
Informally, a non-malleable code with respect to a tampering function family $\mathcal{F}$ is
equipped with a randomized encoder Enc and a deterministic decoder Dec such that
$\mathsf{Dec}(\mathsf{Enc}(m)) = m$ and for any tampering function $f \in \mathcal{F}$ the
following holds: for any message $m$, $\mathsf{Dec}(f(\mathsf{Enc}(m)))$ is either
the message $m$ or is $\varepsilon$-close (in statistical distance) to a distribution $D_f$
independent of $m$. The parameter $\varepsilon$ is called the error. Thus, in some sense, either
the message arrives correctly, or the message is entirely lost and becomes gibberish. A formal
definition of non-malleable codes is given below.

First we define the replace function $\mathsf{replace}: \{0,1\}^* \times \{0,1\}^* \to \{0,1\}^*$.
If the second input to replace is a single value $s$, replace all occurrences of $\mathsf{same}^*$
in the first input with $s$ and output the result. If the second input to replace is a set
$(s_1, \ldots, s_n)$, replace all occurrences of $\mathsf{same}^*_i$ in the
first input with $s_i$ for all $i$ and output the result.
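
The behaviour of replace can be sketched in a few lines of Python. This is an illustrative model only: messages are represented as Python strings of bits, the special symbols as the strings `"same*"` and `"same*i"`, and the first input may be a single value or a tuple of values (as needed for the one-many and many-many definitions later). None of these representation choices come from the paper.

```python
SAME = "same*"

def same_i(i):
    """Symbol same*_i, standing for 'the i-th original message'."""
    return f"same*{i}"

def replace(first, second):
    """Substitute the special symbols in `first` by the message(s) in `second`.

    If `second` is a single message s, every same* becomes s.
    If `second` is a tuple (s_1, ..., s_n), every same*_i becomes s_i.
    """
    if isinstance(second, (list, tuple)):
        mapping = {same_i(i): s for i, s in enumerate(second, start=1)}
    else:
        mapping = {SAME: second}
    if isinstance(first, (list, tuple)):
        return tuple(mapping.get(v, v) for v in first)
    return mapping.get(first, first)

print(replace("same*", "0110"))                          # message survives
print(replace("1111", "0110"))                           # unrelated value kept
print(replace(("same*", "0001"), "0110"))                # one-many style output
print(replace(("same*2", "0000", "same*1"), ("01", "10")))
```

Because messages are bit strings, they can never collide with the special symbols in this representation.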

Definition 1.5 (Coding schemes). Let $\mathsf{Enc}: \{0,1\}^k \to \{0,1\}^n$ and
$\mathsf{Dec}: \{0,1\}^n \to \{0,1\}^k \cup \{\perp\}$
be functions such that Enc is a randomized function (i.e., it has access to private randomness)
and Dec is a deterministic function. We say that (Enc, Dec) is a coding scheme with block length $n$
and message length $k$ if for all $s \in \{0,1\}^k$, $\Pr[\mathsf{Dec}(\mathsf{Enc}(s)) = s] = 1$
(the probability is over the randomness in Enc).

Definition 1.6 (Non-malleable codes). A coding scheme (Enc, Dec) with block length $n$ and message
length $k$ is a non-malleable code with respect to a family of tampering functions
$\mathcal{F} \subseteq \mathcal{F}_n$ and error $\varepsilon$
if for every $f \in \mathcal{F}$ there exists a random variable $D_f$ on
$\{0,1\}^k \cup \{\mathsf{same}^*\}$ which is independent of
the randomness in Enc such that for all messages $s \in \{0,1\}^k$, it holds that

$$|\mathsf{Dec}(f(\mathsf{Enc}(s))) - \mathsf{replace}(D_f, s)| \le \varepsilon.$$
The rate of a non-malleable code C is given by $k/n$. Observe that to construct non-malleable
codes, it is still necessary to restrict the class of tampering functions. Otherwise, the tampering
function could use the function Dec to decode the message $m$, get a message $m'$ by flipping all
the bits in $m$, and use the encoding function to pick any codeword in $\mathsf{Enc}(m')$. However,
presumably the class of tampering functions can now be much richer than what was possible for
error correction and error detection.

Tampering Multiple Codewords. Observe that the above definition envisions the adversary
receiving a single codeword $\mathsf{Enc}(s)$ and outputting a single tampered codeword
$f(\mathsf{Enc}(s))$. We refer to this as the “one-one” setting. While indeed this is very basic,
we argue that this does not capture scenarios where the adversary may be getting multiple
codewords as input or be allowed to output multiple codewords. As an example, consider the
following.

Say there is an auction where each party can submit its bid, and the item goes to the highest
bidder. An honest party, wishing to bid for value $s$, encodes its bid using NM codes and sends
$\mathsf{Enc}(s)$. This indeed would prevent an adversary (which belongs to an appropriate class of
tampering functions) from constructing his own bid by tampering $\mathsf{Enc}(s)$ and coming up
with $\mathsf{Enc}(s+1)$, which would completely compromise the sanity of the auction process.
However, what if the adversary can submit two bids out of which exactly one is guaranteed to be a
winning bid? For example, the adversary can submit bids $r$ and $2s - r$ (for some $r$ not known to
the adversary). This is not ruled out by NM codes!
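
The arithmetic behind the two-bid example is easy to check: the two bids sum to $2s$, so whenever $r \neq s$ exactly one of them exceeds the honest bid $s$. The short sketch below verifies this for a range of values; it models only the bid values, not any encoding, and the function name is an assumption made for the example.

```python
def winning_bids(honest_bid, r):
    """Adversarial bids r and 2s - r that strictly beat the honest bid s."""
    adversarial = (r, 2 * honest_bid - r)
    return [b for b in adversarial if b > honest_bid]

s = 10
for r in (3, 9, 11, 17):
    print(r, 2 * s - r, winning_bids(s, r))

# Exactly one of the two bids wins whenever r differs from s.
print(all(len(winning_bids(s, r)) == 1 for r in range(0, 21) if r != s))
```

Each bid on its own can look unrelated to $s$, but the pair is correlated with $s$, which is the kind of joint dependence the one-one definition does not exclude.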

Towards that end, we introduce a stronger notion which we call one-many NM codes. Intuitively, 
this guarantees the following. Consider the set of codewords output by the adversary. We require 
that even the joint distribution of the encoded value be independent of the value encoded in the 
input. A formal definition is given below: 

Definition 1.7 (One-Many Non-malleable codes). A coding scheme (Enc, Dec) with block length
$n$ and message length $k$ is a one-many non-malleable code with respect to a family of tampering
functions $\mathcal{F} \subseteq (\mathcal{F}_n)^t$ and error $\varepsilon$ if for every
$(f_1, \ldots, f_t) \in \mathcal{F}$, there exists a random variable $D_{\vec{f}}$ on
$(\{0,1\}^k \cup \{\mathsf{same}^*\})^t$ which is independent of the randomness in Enc such that
for all messages $s \in \{0,1\}^k$, it holds that

$$|(\mathsf{Dec}(f_1(X)), \ldots, \mathsf{Dec}(f_t(X))) - \mathsf{replace}(D_{\vec{f}}, s)| \le \varepsilon,$$

where $X = \mathsf{Enc}(s)$. We refer to $t$ as the tampering degree of the non-malleable code.

We argue that one-many non-malleable codes are a basic notion which is interesting to study
independent of concrete applications. However, later we point out some concrete applications to
non-malleable secret sharing (where one wishes to store multiple secrets), and to witness
signatures.

An expert in cryptography by now would have noticed that this is analogous to the well studied
notion of one-many non-malleable commitments in the literature. Even though both notions deal with
related concerns, we note that non-malleable codes and non-malleable commitments are fundamentally
different objects, with the latter necessarily based on complexity assumptions. To start with, we
prove a simple impossibility result for one-many non-malleable codes (whereas for one-many
non-malleable commitments, a corresponding positive result is known [PR08]).

Lemma 1.8. One-many non-malleable codes which work for any arbitrary tampering degree and
$\varepsilon < 1/4$ cannot exist for a large class of tampering functions.

Proof. The class of tampering functions which we consider are the ones where each function is
allowed to read any one bit $X_i$ of its choice from the input code $X$, and output a fresh encoding
of $X_i$. Most natural tampering functions (including split-state ones [DPW10], [CG14a]) considered
in the literature fall into this class. Assume that the encoded value $s$ has at least 4
possibilities (length 2 bits or higher). The case of a single bit $s$ is discussed later.

Recall that $n$ is the length of the code. We set $t = n$. Let $X = \mathsf{Enc}(s)$ be the input
codeword where $s$ is chosen at random. We consider $n$ tampering functions where $f_i$ simply
reads $X_i$ and outputs a fresh encoding $W_i = \mathsf{Enc}(X_i)$. Now consider
$(\mathsf{Dec}(f_1(X)), \ldots, \mathsf{Dec}(f_n(X)))$. Observe
that this is exactly the bits of the string $X$. If the distinguisher applies the decode procedure
on $X$, it will recover $s$. Now consider any possible output $(d_1, \ldots, d_n)$ of
$D_{\vec{f}}$. Note that there cannot exist $d_i$ which is $\mathsf{same}^*$. This is because
otherwise it will be replaced by $s$ (see Definition 1.7), which is at least 2 bits, while
$\mathsf{Dec}(W_i)$ is just a single bit. This in turn implies that the value
$\mathsf{replace}(D_{\vec{f}}, s)$ (from Definition 1.7) is independent of $s$ and $X$. Thus a
distinguisher (given access to $s$) can easily have an advantage exceeding $\varepsilon$.

For a single bit $s$, we modify our tampering functions to encode two bits:
$W_i = \mathsf{Enc}(X_i \| 0)$. Then again we can argue that none of the $d_i$ will be
$\mathsf{same}^*$, since then it will be replaced by $s$, which is only one bit. This in turn again
implies that $\mathsf{replace}(D_{\vec{f}}, s)$ is independent of $s$ and $X$. This
concludes the proof.

□ 
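
The first half of the argument above (the $n$ decoded outputs jointly reveal $X$, and hence $s$) can be traced on a toy coding scheme. Everything in the sketch below is an assumption made for illustration: the toy scheme $\mathsf{Enc}(s) = (r, r \oplus s)$ is not non-malleable and is not from the paper, and a single bit $X_i$ is encoded by padding it with zeros to the message length, which differs from the length-based treatment of $\mathsf{same}^*$ in the proof.

```python
import secrets

K = 2          # message length of the toy scheme (illustrative)
N = 2 * K      # block length of the toy scheme

def xor_bits(a, b):
    """Bitwise XOR of two equal-length bit strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

def enc(s):
    """Toy randomized encoder: (r, r XOR s) for a fresh random r."""
    r = "".join(secrets.choice("01") for _ in range(K))
    return r + xor_bits(r, s)

def dec(c):
    """Toy decoder: XOR of the two halves."""
    return xor_bits(c[:K], c[K:])

def make_bit_reader(i):
    """Tampering function f_i: read bit X_i, output a fresh encoding of it."""
    def f_i(codeword):
        # Assumption: the bit is zero-padded to K bits before encoding.
        return enc("0" * (K - 1) + codeword[i])
    return f_i

def attack(s):
    """Run the n bit-reading tampering functions and decode their outputs."""
    x = enc(s)
    tampered = [make_bit_reader(i)(x) for i in range(N)]
    decoded = [dec(w) for w in tampered]
    # The last bit of each decoded value is X_i, so X can be reassembled.
    recovered_codeword = "".join(d[-1] for d in decoded)
    return dec(recovered_codeword), decoded

recovered, decoded = attack("10")
print(decoded, recovered)
```

Regardless of the encoder's randomness, the recovered value equals the original message, so the joint distribution of the decoded outputs depends on $s$.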

We also introduce a natural generalization which we call many-many non-malleable codes. This 
refers to the situation where the adversary is given multiple codewords as input. 

Definition 1.9 (Many-Many Non-malleable codes). A coding scheme (Enc, Dec) with block length
$n$ and message length $k$ is a many-many non-malleable code with respect to a family of tampering
functions $\mathcal{F} \subseteq (\mathcal{F}_n)^t$ and error $\varepsilon$ if for every
$(f_1, \ldots, f_t) \in \mathcal{F}$, there exists a random variable $D_{\vec{f}}$ on
$(\{0,1\}^k \cup \{\mathsf{same}^*_i\}_{i \in [u]})^t$ which is independent of the randomness in
Enc such that for all vectors of messages $(s_1, \ldots, s_u)$, $s_i \in \{0,1\}^k$, it holds that

$$|(\mathsf{Dec}(f_1(X)), \ldots, \mathsf{Dec}(f_t(X))) - \mathsf{replace}(D_{\vec{f}}, (s_1, \ldots, s_u))| \le \varepsilon,$$

where $X_i = \mathsf{Enc}(s_i)$ and $X = (X_1, \ldots, X_u)$.

Lemma 1.10. One-many non-malleable codes with tampering degree $t$ and error $\varepsilon$ are also
many-many non-malleable codes for tampering degree $t$ and error $u\varepsilon$ (where $u$ is as in
Definition 1.9).

Proof. This proof relies on a simple hybrid argument and the fact that all sources
$X_1, \ldots, X_u$ are independent. We only provide a proof sketch here. Assume towards
contradiction that there exists a one-many code with error $\varepsilon$ which, under the many-many
tampering adversary, has error higher than $u\varepsilon$. That is, the adversary $\vec{f}$ is given
as input $(X_1, \ldots, X_u)$, which are encodings of $(s_1, \ldots, s_u)$ respectively. This is
referred to as hybrid 0. Now consider the following hybrid experiment.
In the $i$-th hybrid experiment, the code $X_i$ is changed to be an encoding of 0 (as opposed to an
encoding of $s_i$). We claim that in this experiment, the error changes by at most $\varepsilon$.
This is because otherwise we can construct a one-many tampering adversary with error higher than
$\varepsilon$. To construct such an adversary $\vec{f}^*$, each $f^*_j$ has the codewords other
than $X_i$ hardcoded in it and takes $X_i$ as input. This would show
an adversary against which one-many non-malleable codes have an error higher than $\varepsilon$.

By the time we reach the $(u-1)$-th hybrid experiment, the error could only have reduced by at
most $(u-1)\varepsilon$. However, in the $(u-1)$-th hybrid experiment, the error can be at most
$\varepsilon$ since it corresponds to the one-many setting. Hence, the error in hybrid 0 could
have been at most $u\varepsilon$. This concludes the proof.

□ 
Non-malleable codes in the split-state model. An important and well studied family of
tampering functions (which is also relevant to the current work) is the family of tampering
functions in the $C$-split-state model, for $C \ge 2$. In this model, each tampering function $f$
is of the form $(f_1, \ldots, f_C)$ where $f_i \in \mathcal{F}_{n/C}$, and for any codeword
$x = (x_1, \ldots, x_C) \in (\{0,1\}^{n/C})^C$ we define
$(f_1, \ldots, f_C)(x_1, \ldots, x_C) = (f_1(x_1), \ldots, f_C(x_C))$. Thus each $f_i$
independently tampers a fixed partition of the codeword. Non-malleable codes in this model can
also be viewed as non-malleable secret sharing. This is because the strings $(x_1, \ldots, x_C)$
can be seen as the shares of $s$, and tampering each
share individually does not allow one to “maul” the shared secret $s$.
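
The way a split-state tampering function acts on a codeword can be sketched as follows. The codeword is modelled as a Python bit string whose length is divisible by $C$, and the two sample per-block functions are hypothetical examples; the sketch shows only the structure of the tampering, not any particular code.

```python
def split(codeword, c):
    """Cut a codeword into c equal-length blocks x_1, ..., x_c."""
    if len(codeword) % c != 0:
        raise ValueError("codeword length must be divisible by the number of states")
    size = len(codeword) // c
    return [codeword[i * size:(i + 1) * size] for i in range(c)]

def split_state_tamper(codeword, fs):
    """Apply (f_1, ..., f_C): each f_i sees and rewrites only block x_i."""
    blocks = split(codeword, len(fs))
    return "".join(f(block) for f, block in zip(fs, blocks))

def flip_all(block):
    """Per-block tampering: complement every bit."""
    return "".join("1" if b == "0" else "0" for b in block)

def overwrite_with_zeros(block):
    """Per-block tampering: ignore the input and output all zeros."""
    return "0" * len(block)

# 2-split-state tampering of a 4-bit codeword.
print(split("0110", 2))
print(split_state_tamper("0110", [flip_all, overwrite_with_zeros]))
```

The key restriction is visible in `split_state_tamper`: no $f_i$ receives any block other than its own, so the output on one part cannot depend on the other parts.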

There has been a lot of recent work on constructing explicit and efficient non-malleable codes
in the $C$-split-state model. Since $C = 1$ includes all of $\mathcal{F}_n$, the best one can hope
for is $C = 2$. A Monte-Carlo construction of non-malleable codes in this model was given in the
original paper on non-malleable codes [DPW10] for $C = 2$ and then improved in [CG14a]. However,
both of these constructions are inefficient. For $C = 2$, which corresponds to the hardest case,
these Monte-Carlo constructions imply the existence of codes of rate close to $1/2$. On the other
extreme, $C = n$ corresponds to the case of bit tampering, where each function $f_i$ acts
independently on a particular bit of the codeword. By a recent line of work [DKO13], [ADL14],
[CG14b], [CZ14], [ADKO15], we now have almost optimal constructions of non-malleable codes in the
$C$-split-state model, for any $C \ge 2$.

We remark that all the prior work in the information theoretic setting has only considered the
construction of what we call one-one NM codes in the split-state model.² That is, the adversary
is only given as input a single codeword and outputs a single codeword. Our new notions seek to
remedy that fact.


Many-many non-malleable secret sharing. Consider the example of non-malleable secret
sharing. What if there are shares of multiple secrets which the adversary can tamper with? What
if the adversary is allowed to output shares of multiple secrets? For example, say there are two
secrets and two devices. Each device stores one share of each of the secrets. Say that an adversary
is able to tamper with the data stored on each device individually (or infect each of them with a
virus). Then, the current notion of one-one NM codes does not rule out a non-trivial relationship
between the two resulting secrets and the two original secrets we start with. It is conceivable
that what we need here is a two-two non-malleable secret sharing. Our many-many non-malleable codes
directly lead to such a many-many non-malleable secret sharing.

Subsequent Work. The notion of one-many non-malleable codes has been used to construct
witness signatures [GJK15]. Very roughly, witness signatures allow any party with a witness to
some NP statement to sign a message such that anyone can verify that the message was indeed
signed using a valid witness to the NP statement. On the other hand, the signatures should still be
unforgeable: that is, producing a signature on a new message (even given several message-signature
pairs) should be as hard as computing a witness to the NP statement itself. There is no setup or
any key generation involved. Witness signatures can be seen as an analogue of the notion of witness
encryption [GGSW13] for signatures. The notion of witness signatures was introduced in [GJK15],
who used one-many non-malleable codes to propose a construction in the tamper-proof hardware
model. The fact that non-malleable codes satisfy one-many security (as opposed to just one-one)
was crucial in their work.

² A notion called continuous non-malleable codes was considered in [FMNV14], where a codeword is
tampered multiple times, but the experiment stops whenever an error message is encountered. The
constructions provided for continuous non-malleable codes are in the computational setting. We
discuss this in Section 1.5.
1.2 Summary of results


Our first main result is an explicit construction of a (2, t)-seedless non-malleable extractor. We note 
that prior to this work, such a construction was not known for even t = 1 and full min-entropy. 

Theorem 1. There exists a constant $\gamma > 0$ such that for all $n > 0$ and t < , there exists an
efficient seedless (2, t)-NM extractor at min-entropy $n - n^{\gamma}$ with error and output length
m = .

Next, we show that it is possible to efficiently sample almost uniformly from the pre-image of
any output of this extractor. We prove this in Section [H Combining this with Theorem 5.1 and a
hybrid argument, we immediately have the following result.

Theorem 2. There exists a constant $\gamma > 0$ such that for all $n > 0$ and t < , there exists an
efficient construction of one-many non-malleable codes in the 2-split-state model with tampering
degree t, relative rate and error

We next improve the min-entropy rate requirements of seeded non-malleable extractors. As
mentioned above, prior to this work, the best known such construction worked for min-entropy rate
0.499 [Li12b]. We have the following result.

Theorem 3. There exists a constant $c$ such that for all $n > 0$, $\varepsilon > 0$ and
$k \ge c \log^2(n/\varepsilon)$, there exists an explicit construction of a seeded non-malleable
extractor $\mathsf{snmExt}: \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$, with $m = \Omega(k)$ and
$d = O(\log^2(n/\varepsilon))$.

We in fact have a more general result, and can handle t-adversarial functions in the seeded
non-malleable case as well, which improves a result of [CRS14]. We refer the reader to Section 3
for more details.

Combined with the protocol developed in [DW09], this immediately gives the following result
about privacy amplification, which matches the best known result in [Li15a] but has a simpler
protocol.

Theorem 4. There exists a constant $C$ such that for any $\varepsilon > 0$ with
$k \ge C(\log n + \log(1/\varepsilon))^2$,
there exists an explicit 2-round privacy amplification protocol with an active adversary for (n, k)
sources, with security parameter $\log(1/\varepsilon)$ and entropy loss
$O(\log n + \log(1/\varepsilon))$.

1.3 Other possible approaches to construct one-many non-malleable codes 

Since a major part of this paper is devoted to constructing explicit seedless (2, t)-non-malleable
extractors (and providing efficient algorithms for sampling almost uniformly from the pre-image
of any output), and one of the major motivations for such explicit extractors is to construct
one-many non-malleable codes in the 2-split-state model, a natural question is whether existing
constructions of non-malleable codes in the 2-split-state model can be modified to satisfy the
stronger notion of one-many non-malleable codes.

Our first observation is that not every construction of a one-one non-malleable code satisfies
the stronger notion of being a one-many non-malleable code. Intuitively, suppose Enc
and Dec are the encoding and decoding functions of some non-malleable code against some class of
tampering functions T. Then, for $f_1, f_2 \in T$ and any message m, $\mathrm{Dec}(f_i(\mathrm{Enc}(m)))$ is close to $D_{f_i}$,
i = 1, 2. But this does not rule out the possibility that, for instance,
$\mathrm{Dec}(f_2(\mathrm{Enc}(m))) = \mathrm{Dec}(f_1(\mathrm{Enc}(m))) + m + 1$. Clearly, such a code is not non-malleable against
two adversaries from T and hence is not one-many.

We now take a specific example. Suppose C is a one-one NM code against 2-split-state adversaries.
We construct another code C' where the message m is broken into $m_1$ and $m_2$ using additive
secret sharing. One then encodes $m_1$ and $m_2$ separately using the encoder of C (and includes
both as part of the codeword, each encoding being equally divided between the two halves). It is
easy to show that C' is still one-one NM.

On the other hand, suppose two adversaries act in the following way. The first adversary keeps the
encoding of $m_1$ and puts in an encoding of 1 of its own. That will be the first output codeword
(of message $m_1 + 1$). The second adversary keeps the encoding of $m_2$ and puts in an encoding of 0
of its own. That will be the second output codeword (of message $m_2$). Now it can be seen that the
messages of the two output codewords sum to m + 1. Thus, C' is not one-many in the 2-split-state model.
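The attack above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the message space is taken to be the integers modulo a hypothetical prime `P`, and `inner_enc` / `inner_dec` are trivial placeholders for the encoder and decoder of C (they are not non-malleable, and the division of each codeword into two halves is not modelled). The point is only the arithmetic: each adversary alone produces a harmless-looking output, but the two decoded messages together sum to m + 1.

```python
import secrets

P = 2**61 - 1  # hypothetical message space Z_P (an assumption, not from the paper)

def inner_enc(x):
    # Placeholder for the encoder of the one-one NM code C (NOT actually non-malleable).
    return ("codeword", x % P)

def inner_dec(cw):
    # Placeholder for the decoder of C.
    return cw[1]

def enc_prime(m):
    # C': additively share m = m1 + m2 (mod P), then encode each share with C.
    m1 = secrets.randbelow(P)
    m2 = (m - m1) % P
    return (inner_enc(m1), inner_enc(m2))

def dec_prime(cw):
    return (inner_dec(cw[0]) + inner_dec(cw[1])) % P

def adversary_1(cw):
    # Keeps Enc(m1), replaces the other part by a fresh encoding of 1.
    return (cw[0], inner_enc(1))

def adversary_2(cw):
    # Keeps Enc(m2), replaces the other part by a fresh encoding of 0.
    return (inner_enc(0), cw[1])

def attack(m):
    cw = enc_prime(m)
    d1 = dec_prime(adversary_1(cw))  # decodes to m1 + 1
    d2 = dec_prime(adversary_2(cw))  # decodes to m2
    return (d1 + d2) % P             # equals m + 1 (mod P)
```

Calling `attack(m)` returns `(m + 1) % P` for every message, regardless of the random sharing, which is exactly the correlation a one-many code must rule out.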

It turns out that existing constructions of non-malleable codes in the split-state model either
fail to satisfy the stronger notion of one-many, or at least it appears non-trivial to generalize
their proofs of non-malleability to multiple adversaries. We briefly discuss these approaches, and why
it appears non-trivial to extend them to handle multiple adversaries. A first approach could be
to generalise the reductions in the recent work of Aggarwal et al. [ADKO15], and possibly show
that one-many non-malleable codes in the 2-split-state model can be reduced to the problem of
constructing one-many non-malleable codes in the bit-tampering model. However, each known
construction of an NM code in the bit-tampering model [CG14b, AGM+14] follows the general
approach of starting out with an initial non-malleable code in the 2-split-state model (which is also
an NM code against bit-wise tampering) of possibly low rate, and then amplifying the rate to almost
optimal. Thus, it is not clear how to use this approach to construct one-many NM codes in the
2-split-state model.

Another approach could be to show that the non-malleable codes constructed by Aggarwal et
al. [ADL14] generalize to handle many adversaries. However, a careful examination of their
proof shows that it crucially uses the fact that the inner product function is an extractor for weak
sources at min-entropy rate slightly greater than . It turns out that this fact is tailor-made for
exactly one adversary, and for handling t > 1 adversaries one needs the inner product function
to be an extractor for min-entropy rate approximately , which is not true. Thus, it is not clear how
to extend their approach either.

Finally, a third approach could be to extend the construction of the seedless non-malleable
extractor for 10 sources in the work of Chattopadhyay and Zuckerman [CZ14]. However, a
careful examination of the proof shows that the crucial step of first constructing a seedless
non-malleable condenser based on a sum-product theorem fails to generalize when there is more
than one adversary.

Thus, it appears that our new explicit constructions of seedless (2, t)-non-malleable extractors 
are a necessity for constructing one-many non-malleable codes in the split-state model. 


1.4 Related work on privacy amplification 

As mentioned above, seeded non-malleable extractors were introduced by Dodis and Wichs in
[DW09] to study the problem of privacy amplification with an active adversary.

The goal is roughly as follows. We pick a security parameter s, and if the adversary Eve
remains passive during the protocol then the two parties should achieve shared secret random bits
that are $2^{-s}$-close to uniform. On the other hand, if Eve is active, then the probability that Eve
can successfully make the two parties output two different strings without being detected is at most
$2^{-s}$. We refer the reader to [DLWZ11] for a formal definition.

Here, while one can still design protocols for an active adversary, the major goal is to design a
protocol that uses as few rounds of interaction as possible and outputs a uniform random string
R whose length is as close to $H_\infty(X)$ as possible (the difference is called the entropy loss). When the
entropy rate of X is large, i.e., bigger than 1/2, there exist protocols that take only one round
(e.g., [MW97], [DKRS06]). However, these protocols all have very large entropy loss. On the other
hand, [DW09] showed that when the entropy rate of X is smaller than 1/2, no one-round
protocol exists; furthermore, the length of R has to be at least O(s) smaller than $H_\infty(X)$. Thus,
the natural goal is to design a two-round protocol with such optimal entropy loss. There has been
a lot of effort along this line [MW97], [DKRS06], [DW09], [RW03], [KR09], [CKOR10], [DLWZ11],
[CRS14], [Li12a], [Li12b]. However, all protocols before the work of [DLWZ11] either need to use
O(s) rounds, or need to incur an entropy loss of $O(s^2)$.

In [DW09], Dodis and Wichs showed that the previously defined seeded non-malleable extractor
can be used to construct 2-round privacy amplification protocols with optimal entropy loss. They
further showed that seeded non-malleable extractors exist when $k > 2m + 3\log(1/\epsilon) + \log d + 9$
and $d > \log(n - k + 1) + 2\log(1/\epsilon) + 7$. However, they were not able to construct such extractors.
The first explicit construction of seeded non-malleable extractors appeared in [DLWZ11], with
subsequent improvements in [CRS14], [Li12a], [DY13]. However, all these constructions require
the entropy rate of the weak source to be bigger than 1/2. In another paper, Li [Li12b] gave
the first explicit non-malleable extractor that breaks this barrier, which works for entropy rate
$(1/2 - \delta)$ for some constant $\delta > 0$. This is the best known seeded non-malleable extractor to date.
Further, [Li12b] also showed a connection between seeded non-malleable extractors and two-source
extractors, which suggests that constructing explicit seeded non-malleable extractors with small
seed length for smaller entropy may be hard.

In a different line of work, Li [Li12a] introduced the notion of a non-malleable condenser, which is
a weaker object than a seeded non-malleable extractor. He then constructed explicit non-malleable
condensers for entropy as small as k = polylog(n) in [Li15a] and used them to give the first
two-round privacy amplification protocol with optimal entropy loss, subject to the constraint that
$k > s^2$.

1.5 Related work on non-malleable codes 

We give a summary of known constructions of non-malleable codes. As remarked above, all known
explicit constructions of non-malleable codes in the information theoretic setting fall within the
framework of what we call one-one non-malleable codes. That is, the adversary is only given as input a
single codeword and outputs a single codeword.

Since the introduction of non-malleable codes by Dziembowski, Pietrzak and Wichs [DPW10],
the most well studied model is the C-split-state model introduced above. By a recent line of
work [DKO13, ADL14, CG14b, CZ14, ADKO15], we now have almost optimal constructions of
non-malleable codes in the C-split-state model, for any $C \ge 2$.

In the model of global tampering, Agrawal et al. [AGM+14] constructed efficient non-malleable
codes with rate 1 - o(1) against a class of tampering functions slightly more general than the family
of permutations.

A notion related to the many-many setting we consider in this work, called continuous
non-malleable codes, was introduced by Faust et al. [FMNV14]. In a continuous non-malleable code,
the codeword is allowed to be tampered multiple times, but the tampering experiment stops
whenever an error message is detected. Thus this model is weaker than the notion we consider. The
constructions of continuous non-malleable codes provided in [FMNV14] are under computational
assumptions.

There are also some other conditional results. Liu and Lysyanskaya constructed efficient
constant rate non-malleable codes in the split-state model against computationally bounded
adversaries under strong cryptographic assumptions. The work of Faust et al. [FMVW14] constructed
almost optimal non-malleable codes against the class of polynomial sized circuits in the
CRS framework. [CCP12], [CCFP11], [CKM11], and [FMNV14] considered non-malleable codes in
other models.

The recent work of Chandran et al. [CGM+15] found interesting connections between non-malleable
codes in a model slightly more general than the split-wise model and non-malleable
commitment schemes.

Organization 

We give an overview of all our explicit constructions in Section 2. We introduce some preliminaries
in Section 3, and formally define seeded and seedless non-malleable extractors in Section 4. We use
Section 5 to present the connection between seedless (2, t)-non-malleable extractors and one-many
non-malleable codes in the 2-split-state model. In Section 6, we present an explicit construction of a
seedless (2, t)-non-malleable extractor. An explicit construction of a seeded non-malleable extractor
at polylogarithmic min-entropy is presented in Section 7. Finally, we use Section 8 to
give efficient encoding and decoding algorithms for the resulting one-many non-malleable codes.


2 Overview of Our Constructions 

In this section, we give an overview of the main ideas involved in our constructions. The main
ingredient in all our constructions is an explicit seedless (2, t)-non-malleable extractor. Further, we
give efficient algorithms for almost uniformly sampling from the pre-image of any output of this
extractor. The explicit construction of many-many non-malleable codes in the 2-split-state model
with tampering degree t then follows in a straightforward way using the connection via Theorem 5.1.

It turns out that by a simple modification of the construction of our seedless non-malleable
extractor, we also have an explicit construction of a seeded non-malleable extractor which works
for any min-entropy $k > \log^2 n$. We will give an overview of how to achieve this as well.

2.1 A Seedless (2, t)-Non-Malleable Extractor 

Let $\gamma$ be a small enough constant and C a large enough one. Let t = .

We construct an explicit function $\mathrm{nmExt} : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}^m$, m = , which satisfies
the following property: if X and Y are independent $(n, n - n^{\gamma})$-sources on $\{0,1\}^n$, and
$A_1 = (f_1, g_1), \ldots, A_t = (f_t, g_t)$ are arbitrary 2-split-state tampering functions such that for any
$i \in [t]$, at least one of $f_i$ or $g_i$ has no fixed points, then the following holds:

$|\mathrm{nmExt}(X,Y) \circ \mathrm{nmExt}(A_1(X,Y)) \circ \ldots \circ \mathrm{nmExt}(A_t(X,Y)) - U_m \circ \mathrm{nmExt}(A_1(X,Y)) \circ \ldots \circ \mathrm{nmExt}(A_t(X,Y))| < \epsilon,$

where $\epsilon$ = 2 .

By a convex combination argument (Lemma 6.14), we show that if nmExt satisfies the property
above, then it is indeed a seedless (2, t)-non-malleable extractor (Definition 4.4).

We introduce some notation. 

Notation: For any function H, and V = H(X, Y), we use $V^{(i)}$ to denote the random variable
$H(A_i(X, Y))$. If $Z_a, Z_{a+1}, \ldots, Z_b$ are random variables, we use to denote the random variable
$(Z_a, \ldots, Z_b)$. For any bit string z, let $z_{\{h\}}$ denote the symbol in the h'th coordinate of z. For a
string x of length m, and $T \subseteq [m]$, let $x_{\{T\}}$ be the projection of x onto the coordinates indexed
by T. For a string x of length m, define the string Slice(x, w) to be the prefix of x with length w.

The high level idea of the non-malleable extractor is as follows. Initially we have two independent
sources (X, Y) and t tampered versions $\{A_i(X, Y)\}$, which can depend arbitrarily on (X, Y). We
would like to gradually break the dependence of $\{A_i(X, Y)\}$ on (X, Y), until at the end we get an
output nmExt(X, Y) which is independent of all $\{\mathrm{nmExt}(A_i(X, Y))\}$.

Towards this end, we would first like to create something from (X, Y) that distinguishes it from
$\{A_i(X, Y)\}$. More specifically, we will obtain a small string Z of length from (X, Y), such
that with high probability Z is different from all $Z^{(i)}$ obtained from $\{A_i(X, Y)\}$. Next, we will
run some iterative steps of extraction from (X, Y), with each step based on one bit of Z. The
crucial property we will have here is that whenever we reach a bit of Z which is different from
the corresponding bits of $\{Z^{(i)}, i \in S\}$ for some subset $S \subseteq [t]$, in that particular step the output
of our extraction from (X, Y) will be (close to) uniform and independent of all the corresponding
outputs obtained from $\{A_i(X, Y), i \in S\}$. Furthermore this will remain true in all subsequent
steps of extraction. Therefore, since Z is different from all $\{Z^{(i)}, i \in [t]\}$, we know that at the end
our output nmExt(X, Y) will be independent of all $\{\mathrm{nmExt}(A_i(X, Y)), i \in [t]\}$. We now elaborate
on the two steps in more detail below.

Step 1: Here we use the sources X and Y to obtain a random variable Z, such that for each
$i \in [t]$, $Z \neq Z^{(i)}$ with probability at least $1 - 2^{-n^{\Omega(1)}}$. Thus by a union bound, with probability
$1 - 2^{-n^{\Omega(1)}}$ we have that Z is different from all $\{Z^{(i)}, i \in [t]\}$. To obtain Z, we first take two small
slices $X_1$ and $Y_1$ from the sources X and Y respectively, with size at least $3n^{\gamma}$, and use the strong
inner product 2-source extractor IP to generate an almost uniform random variable $V = \mathrm{IP}(X_1, Y_1)$.
Now we take an explicit asymptotically good binary linear error correcting code, and obtain
encodings (E(X), E(Y)) of (X, Y) respectively. We now use V to pseudorandomly sample
bits from E(X) to obtain $X_2$, and we do the same thing to obtain $Y_2$ from E(Y). We use known
constructions of an averaging sampler Samp [Zuc97, Vad04] (see Definition 8.4) to do this (in fact,
we can even sample completely randomly since V is close to uniform).

Now define

$Z = X_1 \circ Y_1 \circ X_2 \circ Y_2.$

The length of Z is bits for some small constant $\beta$. Fix some $i \in [t]$. We claim that $Z \neq Z^{(i)}$
with probability at least $1 - 2^{-n^{\Omega(1)}}$. To see this, assume without loss of generality that $f_i$ has no
fixed points. If $X_1 \neq X_1^{(i)}$ or $Y_1 \neq Y_1^{(i)}$, then we have $Z \neq Z^{(i)}$. Now suppose $X_1 = X_1^{(i)}$ and
$Y_1 = Y_1^{(i)}$. Thus, $V = V^{(i)}$. We fix $X_1$; since IP is a strong extractor (Theorem 3.17), V is still
close to uniform and now it is a function of Y, and thus independent of X. Since $X \neq X^{(i)}$, by the
property of the code, we know that E(X) and $E(X^{(i)})$ must differ in at least a constant fraction of
coordinates. Thus, if we uniformly (or pseudorandomly) sample bits from these coordinates,
then with probability $1 - 2^{-n^{\Omega(1)}}$ the sampled strings will be different.
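The sampling argument can be illustrated with a small Python sketch. Everything concrete here is an assumption made for illustration: the Hadamard code stands in for the asymptotically good code E (it has relative distance 1/2 but very poor rate), and Python's `random.Random` seeded by `v` stands in for the sampler Samp driven by V. If two codewords differ in a $\delta$ fraction of coordinates, then s independent uniform samples all miss the differing coordinates with probability $(1-\delta)^s$, which is $2^{-s}$ for this toy code.

```python
import random

def hadamard_encode(bits):
    """Toy stand-in for E: the Hadamard codeword (length 2**k) of a k-bit message."""
    k = len(bits)
    msg = sum(b << j for j, b in enumerate(bits))
    # Coordinate a of the codeword is the inner product <msg, a> over GF(2).
    return [bin(msg & a).count("1") % 2 for a in range(2 ** k)]

def sample_positions(v, length, count):
    """Toy stand-in for Samp: derive `count` coordinates from the seed v."""
    rng = random.Random(v)
    return [rng.randrange(length) for _ in range(count)]

def distinguishing_part(bits, v, count=64):
    """The analogue of X_2: the codeword of `bits` restricted to the sampled coordinates."""
    codeword = hadamard_encode(bits)
    return [codeword[p] for p in sample_positions(v, len(codeword), count)]

x = [1, 0, 1, 1, 0, 0, 1, 0]
x_tampered = [1, 0, 1, 1, 0, 1, 1, 0]  # differs from x in a single bit

# Distinct messages give codewords at relative distance exactly 1/2 for this code.
distance = sum(a != b for a, b in zip(hadamard_encode(x), hadamard_encode(x_tampered)))
```

With 64 samples, `distinguishing_part(x, v)` and `distinguishing_part(x_tampered, v)` agree only if every sampled coordinate misses the differing half of the codeword, which happens with probability $2^{-64}$ over a uniform choice of positions.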

We can now fix Z, $\{Z^{(i)} : i \in [t]\}$, such that $Z \neq Z^{(i)}$ for any i. Since the size of each $Z^{(i)}$ is
small, we have that conditioned on this fixing, the sources X and Y are still independent and have
min-entropy at least $n - O(t\ell)$ each (with high probability), where $\ell$ denotes the length of Z.

Step 2: Here our goal is to gradually break the dependence of $\{A_i(X, Y)\}$ on (X, Y), until at the
end we get an output nmExt(X, Y) which is independent of all $\{\mathrm{nmExt}(A_i(X, Y))\}$. To achieve
this, our crucial observation is that while many other techniques in constructing non-malleable
seeded extractors (such as those in [DLWZ11], [CRS14], [Li12b]) fail in the case where both sources
are tampered, the powerful technique of alternating extraction still works. Thus, we will be relying
on this technique, which has been used a lot in recent studies of extractors and privacy amplification
[DW09], [Li12a], [Li12b], [Li13b], [Li13a], [Li15b]. We now briefly recall the details. The alternating
extraction protocol is an interactive protocol between two parties, Quentin and Wendy, using two
strong seeded extractors $\mathrm{Ext}_q$, $\mathrm{Ext}_w$. Assume initially Wendy has a weak source X and Quentin
has another source Q and a short uniform random string $S_1$. Suppose that X is independent of
$(Q, S_1)$. In the first round, Quentin sends $S_1$ to Wendy, Wendy computes $R_1 = \mathrm{Ext}_w(X, S_1)$ and
sends it back to Quentin, and Quentin then computes $S_2 = \mathrm{Ext}_q(Q, R_1)$. Continuing in this way, in
round i, Quentin sends $S_i$, Wendy computes the random variable $R_i = \mathrm{Ext}_w(X, S_i)$ and sends it
to Quentin, and Quentin then computes the random variable $S_{i+1} = \mathrm{Ext}_q(Q, R_i)$. This is done for
some u steps, and each of the random variables $R_i$, $S_i$ is of length m. Thus, the following sequence
of random variables is generated:

$S_1, R_1 = \mathrm{Ext}_w(X, S_1), S_2 = \mathrm{Ext}_q(Q, R_1), \ldots, S_u = \mathrm{Ext}_q(Q, R_{u-1}), R_u = \mathrm{Ext}_w(X, S_u).$

Also define the following look-ahead extractor:

$\mathrm{laExt}(X, (Q, S_1)) = R_1, \ldots, R_u$
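The message flow of the protocol can be sketched as follows. This is a structural illustration only: `toy_ext` is a hypothetical stand-in built from a keyed BLAKE2b hash, not an information-theoretic strong seeded extractor, and the tags `b"w"` and `b"q"` merely separate the roles of $\mathrm{Ext}_w$ and $\mathrm{Ext}_q$. The names `toy_ext` and `la_ext` are not from the paper.

```python
import hashlib

def toy_ext(source: bytes, seed: bytes, m: int = 16, tag: bytes = b"ext") -> bytes:
    """Heuristic stand-in for a strong seeded extractor with m-byte output (an assumption)."""
    return hashlib.blake2b(source, digest_size=m, key=seed, person=tag).digest()

def la_ext(x: bytes, q: bytes, s1: bytes, u: int, m: int = 16) -> list:
    """Look-ahead extractor: returns R_1, ..., R_u from u rounds of alternating extraction."""
    outputs = []
    s = s1
    for i in range(u):
        r = toy_ext(x, s, m, b"w")      # Wendy:   R_i = Ext_w(X, S_i)
        outputs.append(r)
        if i + 1 < u:
            s = toy_ext(q, r, m, b"q")  # Quentin: S_{i+1} = Ext_q(Q, R_i)
    return outputs
```

For example, `la_ext(x, q, s1, u=4)` returns four 16-byte strings; each $R_i$ depends on X only through a seed that Quentin derived from Q and the previous response, which is the structure the look-ahead property relies on.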

Now suppose we have t tampered versions of X: $X^{(1)}, \ldots, X^{(t)}$, which can depend on X arbitrarily;
and t tampered versions of $(Q, S_1)$: $(Q^{(1)}, S_1^{(1)}), \ldots, (Q^{(t)}, S_1^{(t)})$, which can depend on $(Q, S_1)$
arbitrarily. Let $\mathrm{laExt}(X, (Q, S_1)) = R_1, \ldots, R_u$, and for $h \in [t]$, let
$\mathrm{laExt}(X^{(h)}, (Q^{(h)}, S_1^{(h)})) = R_1^{(h)}, \ldots, R_u^{(h)}$. As long as $(X, X^{(1)}, \ldots, X^{(t)})$ is independent of
$((Q, S_1), (Q^{(1)}, S_1^{(1)}), \ldots, (Q^{(t)}, S_1^{(t)}))$ and t, u, m are small compared to the entropy of X and Q,
one can use induction together with standard properties of strong seeded extractors to show that
the following holds: for any $j \in [u]$,

$R_j, \{R_i^{(h)} : i \in [j-1], h \in [t]\}, \{(Q^{(h)}, S_1^{(h)}) : h \in [t]\}$

$\approx U_m, \{R_i^{(h)} : i \in [j-1], h \in [t]\}, \{(Q^{(h)}, S_1^{(h)}) : h \in [t]\}$

Based on this property, we describe two different approaches to achieve our goal in Step 2. The
first approach was our initial construction, while the second approach is inspired by new techniques
in a recent work of Cohen [Coh15]. It turns out the second approach is simpler and more suitable
for our application to many-many non-malleable codes, thus we only provide the formal proof for
the second approach in this paper (see Section 6). Recall that the high level idea in both approaches
is that we will proceed bit by bit based on the previously obtained string Z, which is different from
all $\{Z^{(i)}, i \in [t]\}$. Whenever we reach a bit of Z which is different from the corresponding bits of
$\{Z^{(i)}, i \in S\}$ for some subset $S \subseteq [t]$, in that particular step the output of our extraction from
(X, Y) will be (close to) uniform and independent of all the corresponding outputs obtained from
$\{A_i(X, Y), i \in S\}$. Furthermore this will remain true in all subsequent steps of extraction. We will
achieve this by running some alternating extraction protocol $\ell$ times, where $\ell$ is the length of
Z. Each time the alternating extraction will be between X and a new $(Q_h, S_{1,h})$ obtained from Y,
where we take $S_{1,h}$ to be a small slice of $Q_h$.

(In fact, the seed $S_1$ in the alternating extraction protocol can be a slightly weak random source as well.)

Construction 1: Our first approach is based on a generalization of the techniques in [Li13a].
Here we first achieve an intermediate goal: whenever we reach a bit of Z which is different from
the corresponding bits of $\{Z^{(i)}, i \in S\}$ for some subset $S \subseteq [t]$, the output of our extraction from
(X, Y) will have some entropy conditioned on all the corresponding outputs obtained from
$\{A_i(X, Y), i \in S\}$. Suppose at step h ($1 \le h \le \ell$) we have obtained $Q_h$ from Y (in the first step we can
take a small slice of Y to be $Q_1$) and use it to run an alternating extraction protocol with X. We
run the alternating extraction for t + 2 rounds and obtain outputs $R_{h,1}, \ldots, R_{h,t+2}$. The crucial
idea is to use the h'th bit of Z to set a random variable $W_h$ as either $(R_{h,1}, \ldots, R_{h,t+1})$ or $R_{h,t+2}$
(padded with an appropriate number of 0's to make them the same length).

Now consider the subset $S \subseteq [t]$ where the h'th bit of Z is different from the h'th bit of
$\{Z^{(i)}, i \in S\}$. If $W_h = (R_{h,1}, \ldots, R_{h,t+1})$, then for all $i \in S$, we have $W_h^{(i)} = R_{h,t+2}^{(i)}$. Since S has at
most t elements, the size of $\{W_h^{(i)}, i \in S\}$ is at most tm. Note that $W_h$ has size (t + 1)m and is
close to uniform. Thus $W_h$ has entropy roughly m conditioned on $\{W_h^{(i)}, i \in S\}$ (here we can ignore
the appended 0's in $W_h^{(i)}$ since they won't affect the entropy in $W_h$). On the other hand, if
$W_h = R_{h,t+2}$, then for all $i \in S$, we have $W_h^{(i)} = (R_{h,1}^{(i)}, \ldots, R_{h,t+1}^{(i)})$. By the property of alternating
extraction we have that $W_h$ is close to uniform conditioned on $\{W_h^{(i)}, i \in S\}$.
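The entropy count in the first case can be written out as follows. This is a sketch that assumes the intended argument is the standard chain rule for average conditional min-entropy (conditioning on a variable of at most $\lambda$ bits costs at most $\lambda$ bits of entropy); the notation $\widetilde{H}_\infty$ is introduced here for illustration and ignores the small statistical error.

```latex
% W_h consists of (t+1) blocks of m bits and is close to uniform, so H_\infty(W_h) \approx (t+1)m.
% The side information \{W_h^{(i)}\}_{i \in S} carries at most |S| \cdot m \le tm non-padding bits.
\widetilde{H}_\infty\!\left(W_h \,\middle|\, \{W_h^{(i)}\}_{i \in S}\right)
  \;\ge\; H_\infty(W_h) - tm
  \;\approx\; (t+1)m - tm
  \;=\; m .
```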

We can now go from having conditional entropy to being conditionally uniform, as follows. We first
convert $W_h$ into a somewhere random source by applying an optimal seeded extractor and trying all
possible choices of the seed. One can show that conditioned on previous random variables generated
in our algorithm, $W_h$ is now a deterministic function of X and thus independent of Y and $Q_h$. We
now take another optimal seeded extractor and use each row in this somewhere random source as
a seed to extract a longer output from $Q_h$. In this way we obtain a new somewhere random source.
If we choose parameters appropriately we can ensure that the size of $W_h$ is much smaller than the
entropy of $Q_h$, and thus the number of rows in this new somewhere random source is much smaller
than its row length. Therefore, by using an extractor from [BRSW06] we can use this somewhere
random source to extract a close to uniform output $V_h$ from X. Since $W_h$ has entropy at least m
conditioned on $\{W_h^{(i)}, i \in S\}$, as long as the size of $V_h$ is small, using standard arguments one can
show that $V_h$ will be close to uniform conditioned on all $\{V_h^{(i)}, i \in S\}$.

We now go into the next step of alternating extraction, where we will take a strong seeded
extractor and use $V_h$ to extract a uniform string $Q_{h+1}$ from Y. We will then use X and $Q_{h+1}$ to
do the alternating extraction for the next step. The point here is that whenever $V_h$ is close
to uniform conditioned on all $\{V_h^{(i)}, i \in S\}$ for some $S \subseteq [t]$, we can show that $Q_{h+1}$ is close to
uniform conditioned on all $\{Q_{h+1}^{(i)}, i \in S\}$. Thus in the next step of alternating extraction, we can
first fix all $\{Q_{h+1}^{(i)}, i \in S\}$, and then fix all the $\{R_{h+1,j}^{(i)}, i \in S, j \in [t+2]\}$, and all the $\{W_{h+1}^{(i)}, i \in S\}$
(these will now be deterministic functions of X). Conditioned on this fixing $Q_{h+1}$ is still close to
uniform, and X still has a lot of entropy left (as long as the size of each $R_{h+1,j}^{(i)}$, $W_{h+1}^{(i)}$ is small).
Therefore, in this step $V_{h+1}$ will be close to uniform even conditioned on all $\{V_{h+1}^{(i)}, i \in S\}$, i.e., once
we have independence it will continue to hold in subsequent steps. Thus our goal is achieved.

(Formal proofs of the claims in the sketch of Construction 1 are not provided in this paper.)

Construction 2: Here we replace our approach in Construction 1 with a more direct approach,
by using the idea of “flip-flop” alternating extraction introduced in a recent paper by Cohen [Coh15],
which is again based on the techniques developed in [Li13a]. Again, assume we are now looking at
the h'th bit of Z, and we have obtained $Q_h$ from Y.

Now each step of alternating extraction will consist of two sub steps of alternating extraction,
with each sub step taking two rounds. In the first sub step, we use X and $Q_h$ to perform an
alternating extraction for two rounds and output $R_{h,1}, R_{h,2}$. If the h'th bit of Z is 0, we take
$V_h = R_{h,1}$; otherwise we take $V_h = R_{h,2}$. Now we will take a strong seeded extractor Ext and use
$V_h$ to extract $\overline{Q}_h = \mathrm{Ext}(Y, V_h)$ from Y. We then use $\overline{Q}_h$ and X to perform the second sub step of
alternating extraction, which again runs for two rounds and outputs $\overline{R}_{h,1}, \overline{R}_{h,2}$. Now if the h'th bit
of Z is 0, we take $\overline{V}_h = \overline{R}_{h,2}$; otherwise we take $\overline{V}_h = \overline{R}_{h,1}$. Here barred variables denote the
second-sub-step counterparts of $Q_h$, $R_{h,j}$, $V_h$. One can see that this is indeed done in a
“flip-flop” manner.
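The control flow of one flip-flop step can be sketched as below. As with any toy, the concrete choices are assumptions: `toy_ext` is a keyed BLAKE2b hash standing in for the strong seeded extractors (it gives no information-theoretic guarantee), the seed of each sub step is taken to be a prefix of the current Q, and all function names and lengths are hypothetical. Only the selection pattern (first output then second for bit 0, second then first for bit 1) is taken from the description above.

```python
import hashlib

def toy_ext(source: bytes, seed: bytes, m: int = 16, tag: bytes = b"ext") -> bytes:
    """Heuristic stand-in for a strong seeded extractor (an assumption, not the paper's Ext)."""
    return hashlib.blake2b(source, digest_size=m, key=seed, person=tag).digest()

def two_rounds(x: bytes, q: bytes, m: int = 16):
    """Two rounds of alternating extraction between X and Q; the seed is a slice of Q."""
    s1 = q[:m]
    r1 = toy_ext(x, s1, m, b"w")
    s2 = toy_ext(q, r1, m, b"q")
    r2 = toy_ext(x, s2, m, b"w")
    return r1, r2

def flip_flop_step(x: bytes, y: bytes, q_h: bytes, z_bit: int, m: int = 16, q_len: int = 32):
    """One step: returns (V-bar_h, Q_{h+1}) for the h'th bit z_bit of Z."""
    r1, r2 = two_rounds(x, q_h, m)
    v = r1 if z_bit == 0 else r2             # first sub step: pick R_{h,1} or R_{h,2}
    q_bar = toy_ext(y, v, q_len, b"y")       # Q-bar_h = Ext(Y, V_h)
    rb1, rb2 = two_rounds(x, q_bar, m)
    v_bar = rb2 if z_bit == 0 else rb1       # second sub step: the opposite choice
    q_next = toy_ext(y, v_bar, q_len, b"y")  # Q_{h+1} is extracted from Y with V-bar_h
    return v_bar, q_next

def toy_nm_ext(x: bytes, y: bytes, z_bits, m: int = 16, q_len: int = 32) -> bytes:
    """Runs one flip-flop step per bit of Z and outputs the last V-bar."""
    q = y[:q_len]  # Q_1: a small slice of Y
    v_bar = b""
    for bit in z_bits:
        v_bar, q = flip_flop_step(x, y, q, bit, m, q_len)
    return v_bar
```

For a bit where Z and a tampered $Z^{(i)}$ differ, one of the two sub steps always has the honest party reading the earlier output while the tampered party reads the later one, which is where the look-ahead property can be applied.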

The idea is as follows. Consider the h'th bit of Z, and let $S \subseteq [t]$ be such that for all $i \in S$, we
have $Z_{\{h\}} \neq Z^{(i)}_{\{h\}}$. Now consider the h'th step of alternating extraction. If $Z_{\{h\}} = 0$, then in the
first sub step of alternating extraction, $V_h = R_{h,1}$; while for all $i \in S$, we have $V_h^{(i)} = R_{h,2}^{(i)}$. Now
it's possible that $V_h$ depends on $\{V_h^{(i)}, i \in S\}$, and thus $\overline{Q}_h$ also depends on $\{\overline{Q}_h^{(i)}, i \in S\}$. However,
when we go into the second sub step of alternating extraction (whose variables carry a bar), we will
choose $\overline{V}_h = \overline{R}_{h,2}$; while for all $i \in S$, we have $\overline{V}_h^{(i)} = \overline{R}_{h,1}^{(i)}$. Thus by the property of alternating
extraction, we have that $\overline{V}_h$ is close to uniform conditioned on all $\{\overline{V}_h^{(i)}, i \in S\}$.

On the other hand, if $Z_{\{h\}} = 1$, then in the first sub step of alternating extraction, $V_h = R_{h,2}$;
while for all $i \in S$, we have $V_h^{(i)} = R_{h,1}^{(i)}$. Thus in this sub step, by the property of alternating
extraction, we have that $V_h$ is close to uniform conditioned on all $\{V_h^{(i)}, i \in S\}$. Therefore we
also have that $\overline{Q}_h$ is close to uniform conditioned on all $\{\overline{Q}_h^{(i)}, i \in S\}$, and they are deterministic
functions of Y given $V_h$ and $\{V_h^{(i)}, i \in S\}$. Thus, when we go into the second sub step of alternating
extraction, we can first fix all $\{\overline{Q}_h^{(i)}, i \in S\}$ and $\overline{Q}_h$ is still close to uniform. Now all $\{\overline{V}_h^{(i)}, i \in S\}$
will be deterministic functions of X, and thus we can further fix them. As long as the size of each
$\overline{R}_{h,j}^{(i)}$ is small, conditioned on this fixing X still has a lot of entropy left. Therefore $\overline{Q}_h$ can still
be used to perform an alternating extraction with X, and this gives us that $\overline{V}_h = \overline{R}_{h,1}$ is close to
uniform. That is, again we get that $\overline{V}_h$ is close to uniform conditioned on all $\{\overline{V}_h^{(i)}, i \in S\}$.

Once we have this property, we can go into the next step of alternating extraction. We will
now take a strong seeded extractor and use $\overline{V}_h$ to extract $Q_{h+1}$ from Y, and then use X and $Q_{h+1}$
to perform the next step of alternating extraction. Since $\overline{V}_h$ is close to uniform conditioned on all
$\{\overline{V}_h^{(i)}, i \in S\}$, we also have that $Q_{h+1}$ is close to uniform conditioned on all $\{Q_{h+1}^{(i)}, i \in S\}$. Thus by
the same argument as above, we can first fix all $\{Q_{h+1}^{(i)}, i \in S\}$ and all $\{V_{h+1}^{(i)}, i \in S\}$, and conditioned
on this fixing $Q_{h+1}$ is still close to uniform. Therefore $V_{h+1}$ will be close to uniform conditioned
on all $\{V_{h+1}^{(i)}, i \in S\}$. Thus, going into the second sub step, we will also have that $\overline{Q}_{h+1}$ is close
to uniform conditioned on all $\{\overline{Q}_{h+1}^{(i)}, i \in S\}$. Thus again we can first fix all $\{\overline{Q}_{h+1}^{(i)}, i \in S\}$ and all
$\{\overline{V}_{h+1}^{(i)}, i \in S\}$, and conditioned on this fixing $\overline{Q}_{h+1}$ is still close to uniform. Therefore we get that
$\overline{V}_{h+1}$ is close to uniform conditioned on all $\{\overline{V}_{h+1}^{(i)}, i \in S\}$, i.e., once we have independence it will
continue to hold in subsequent steps. Thus our goal is achieved.

2.2 An Explicit Seeded Non-Malleable Extractor for Polylogarithmic Min-Entropy

Let $\gamma > 0$ be a small constant. For any $\epsilon > 0$, let k > O (log^’*"'^ (^)), t < and d =
O (t^log^(^)). We construct a function $\mathrm{snmExt} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$, m = O (log ( 7 )),
such that the following holds: if X is an (n, k)-source, Y is an independent uniform seed of length
d, and $A_1, \ldots, A_t$ are arbitrary functions with no fixed points, then

$\mathrm{snmExt}(X, Y), \mathrm{snmExt}(X, A_1(Y)), \ldots, \mathrm{snmExt}(X, A_t(Y)) \approx_{\epsilon} U_m, \mathrm{snmExt}(X, A_1(Y)), \ldots, \mathrm{snmExt}(X, A_t(Y)).$

We now describe our construction, which is essentially a simple modification of our seedless
non-malleable extractor construction.

Step 1: Let $Y_1$ be a small slice of Y. Compute $V = \mathrm{Ext}(X, Y_1)$, where Ext is a strong seeded
extractor. Now we use V to randomly sample bits from E(Y), where E is the encoder of an
asymptotically good error correcting code. Let the sampled bits be $Y_2$. We define

$Z = Y_1 \circ Y_2.$

We show that with high probability $Z \neq Z^{(i)}$ for all $i \in [t]$. We provide a brief sketch of the
argument. Fix any $i \in [t]$. If $Y_1 \neq Y_1^{(i)}$, then clearly $Z \neq Z^{(i)}$. Now suppose $Y_1 = Y_1^{(i)}$. We fix
$Y_1$, and since Ext is a strong seeded extractor, it follows that V is still close to uniform, and is a
deterministic function of X, thus independent of Y, $\{Y^{(i)}, i \in [t]\}$. Therefore V can be used to sample
bits from E(Y). Since $A_i$ has no fixed points, it follows that $Y \neq Y^{(i)}$. Thus E(Y) and $E(Y^{(i)})$ must
differ in at least a constant fraction of coordinates. Therefore with high probability $Y_2 \neq Y_2^{(i)}$. By
a union bound, with high probability $Z \neq Z^{(i)}$ for all $i \in [t]$.

Step 2: As long as the size of $(Y_1, V, Y_2)$ is small, we can show that conditioned on the fixing of
these variables, X and Y are still independent. Moreover, both X and Y only lose a small amount of
entropy. Now we can use either Construction 1 or Construction 2 above to finish the extraction.
The same argument will show that at the end snmExt(X, Y) will be close to uniform conditioned
on all $\{\mathrm{snmExt}(X, A_i(Y)), i \in [t]\}$.

We refer the reader to Section 7 for more details.

Comparison to the LCB in [Coh15]. Our second approach in constructing non-malleable
two-source extractors is inspired by the work of [Coh15]. In particular, we use the idea of
“flip-flop” alternating extraction introduced there. However, there are also some differences between our
construction and the “Local Correlation Breaker” constructed in [Coh15], which are worth pointing
out.

First, in our construction, both sources X and Y are tampered. This results in t random
variables $X^{(1)}, \ldots, X^{(t)}$ that are arbitrarily correlated with X, and t random variables $Y^{(1)}, \ldots,
Y^{(t)}$ that are arbitrarily correlated with Y. In contrast, in the case of the Local Correlation Breaker
constructed in [Coh15], there are only random variables correlated with one source, while the other
source is not tampered. In this sense, our construction can actually be viewed as giving a stronger
version of the LCB.

Second, the way to obtain a string that distinguishes the correlated parts is quite different. In
the case of the LCB, one can simply use the index of each row in the somewhere random source.

On the other hand, in our case we do not have such an index, since we only have access to the
two sources X and Y. Thus, we have to make an extra effort to create such a string from these two
sources, by using error correcting codes and random sampling.

2.3 Efficient Algorithms for Many-Many Non-Malleable Codes


The above construction gives a (2, t) non-malleable extractor. However, for our application to
constructing explicit many-many non-malleable codes, given any output of the extractor we need
to efficiently sample (almost) uniformly from its pre-image. Doing this with the construction
described above is highly non-trivial. Therefore, in order to make it easy to efficiently sample from
the pre-image of an output (i.e., “inverting” the extractor), we use additional ideas to modify the
non-malleable extractor. We now briefly describe the main ideas that we use. Recall that t is the
number of tampered versions of the sources, and $\ell$ is the length of the string Z we obtained.

Idea 1: Since our construction of the (2,t) non-malleable extractor involves multiple steps of 
alternating extraction, we need to first invert the extractors used in these steps. For this purpose, 
we will use linear seeded strong extractors in all alternating extraction steps. A linear seeded strong 
extractor is an extractor such that for any fixed seed, the output is a linear function of the input. 
Thus for any fixed seed, in order to sample uniformly from an output’s pre-image, we can just 
sample uniformly according to a system of linear equations, which can be done efficiently. 
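
To make the inversion step in Idea 1 concrete, here is a minimal Python sketch of sampling uniformly from the solution set of a linear system over F_2. For a fixed seed, inverting a linear seeded extractor reduces to exactly this task. The function names (reduce_gf2, sample_preimage, apply_gf2) and the bitmask encoding of the rows are illustrative assumptions, not part of the construction in this paper.

```python
import random

def reduce_gf2(rows, rhs):
    """Row-reduce the system rows * x = rhs over GF(2).

    Each row is an int bitmask (bit j is the coefficient of x_j).
    Returns {pivot_position: (row_mask, rhs_bit)} in reduced form,
    or None if the system has no solution.
    """
    pivots = {}
    for mask, bit in zip(rows, rhs):
        # Clear every existing pivot position from the new row.
        for p, (pm, pb) in pivots.items():
            if (mask >> p) & 1:
                mask ^= pm
                bit ^= pb
        if mask == 0:
            if bit:
                return None  # 0 = 1: inconsistent system
            continue
        p = mask.bit_length() - 1
        # Clear the new pivot position from the older pivot rows.
        for q, (qm, qb) in list(pivots.items()):
            if (qm >> p) & 1:
                pivots[q] = (qm ^ mask, qb ^ bit)
        pivots[p] = (mask, bit)
    return pivots

def sample_preimage(rows, rhs, n, rng):
    """Return a uniformly random n-bit x with rows * x = rhs, or None."""
    pivots = reduce_gf2(rows, rhs)
    if pivots is None:
        return None
    x = 0
    # Free variables are chosen uniformly at random.
    for j in range(n):
        if j not in pivots and rng.getrandbits(1):
            x |= 1 << j
    # Each pivot variable is then forced by its (reduced) equation.
    for p, (pm, pb) in pivots.items():
        parity = bin(pm & x).count("1") & 1
        if pb ^ parity:
            x |= 1 << p
    return x

def apply_gf2(rows, x):
    """Evaluate the linear map given by rows on the bitmask x."""
    return [bin(r & x).count("1") & 1 for r in rows]
```

Every solution corresponds to exactly one assignment of the free variables, so choosing those uniformly gives a uniform sample from the pre-image, whose size is 2^(n - rank).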

Idea 2: Next, we will divide the sources X and Y into blocks. In each step of alternating
extraction, we will also divide Q_h and \bar{Q}_h into blocks. Then, whenever we use an extractor to
extract from X, Q_h, or \bar{Q}_h, we will use a completely new block of X, Q_h, or \bar{Q}_h. When we apply an
extractor to Y to generate Q_h or \bar{Q}_h, we will also use completely new blocks of Y to do this. This
ensures that we do not have to deal with multiple compositions of extractors on the same string.
That is, different applications of extractors are used on different parts of the inputs, so to invert
them we can invert each part separately. Note that each alternating extraction takes at most 2
rounds, so it suffices to divide Q_h and \bar{Q}_h into two blocks.

Here, we need to choose the parameters appropriately. Let the size of each S_{h,j} and R_{h,j} produced
in alternating extraction be roughly d, and the size of each block of Q_h and \bar{Q}_h be n_q. Since in
the analysis of each alternating extraction we need to fix O(t) tampered versions of (S_{h,j}, \bar{S}_{h,j}) and
(R_{h,j}, \bar{R}_{h,j}), we need to have n_q ≥ Ω(td). Now in the analysis of the entire non-malleable extractor,
we need to fix O(t) tampered versions of Q_h and \bar{Q}_h, and O(tℓ) tampered versions of (S_{h,j}, \bar{S}_{h,j}).
The total size of this is O(tℓd). Thus we can take all of t, ℓ, d to be small enough that
the total entropy loss of X and Y is small. Note that X and Y initially have almost full
entropy. Therefore, we can divide X and Y into O(ℓ) blocks (or even O(tℓ) blocks, for a reason we
will explain below), such that even conditioned on the fixing of all {S_{h,j}, R_{h,j}, \bar{S}_{h,j}, \bar{R}_{h,j}, Q_h, \bar{Q}_h}
and all previous blocks, each block still has entropy rate at least, say, 0.9 (this can be achieved as
long as n/(tℓ) ≫ tℓd). This ensures that each time we apply an extractor, we can use new blocks
of X and Y.

Idea 3: However, there is one issue with inverting a linear seeded extractor. The problem is
that the pre-image size for different seeds may not be the same. For example, if we have a linear
seeded extractor that outputs m bits from an n-bit input, then one can show that for most seeds
the pre-image size is 2^{n-m}, while for some seed the pre-image size can be 2^n. If we first generate
the seed uniformly and then sample uniformly from the pre-image given each seed, then the overall
distribution is not uniform over the entire pre-image, due to this size difference.
To rectify this, we construct a new linear seeded extractor iExt : {0,1}^n × {0,1}^d → {0,1}^m with
m = d/2 that works for entropy rate 0.9 sources. Moreover, iExt has the property that given any
output, for any fixed seed the pre-image size is the same. The idea is as follows. We first take 0.1d
bits from the seed and use an averaging sampler to sample 0.9d distinct bits from the source. Since
we are using a sampler and the source has entropy rate 0.9, an argument in [Vad04] shows that with
high probability, conditioned on the 0.1d bits of the seed, the sampled 0.9d bits from the source
also have entropy rate roughly 0.9. Now we take the remaining 0.9d bits of the seed and the sampled 0.9d
bits from the source and apply the inner product two-source extractor (or just use the leftover hash
lemma), which can output d/2 uniform random bits. The point is that given any output and
any fixed seed, the pre-image of the inner product part has the same size,^1 and the pre-image
of any sampled bits also has the same size (since the pre-image is just the sampled bits together with
any possible choice of the other n − 0.9d bits).
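
The pre-image size mismatch described at the start of Idea 3 can be seen in a tiny case. The sketch below assumes a one-bit inner product over F_2 as a stand-in for a linear seeded extractor (so m = 1) and counts pre-images by brute force. The function names are hypothetical, and the toy map is not the iExt of this paper.

```python
from itertools import product

def ip_bit(x, s):
    """Inner product of two bit tuples over GF(2)."""
    return sum(a & b for a, b in zip(x, s)) % 2

def preimage_size(seed, output, n):
    """Number of n-bit inputs x with ip_bit(x, seed) == output."""
    return sum(1 for x in product((0, 1), repeat=n) if ip_bit(x, seed) == output)

def preimage_sizes_by_seed(n, output=0):
    """Map every n-bit seed to the pre-image size of the given output."""
    return {s: preimage_size(s, output, n) for s in product((0, 1), repeat=n)}
```

For n = 3 and output 0, every non-zero seed has a pre-image of size 2^{n-1} = 4, while the all-zero seed has a pre-image of size 2^n = 8. Picking the seed uniformly and then an input uniformly within that seed's pre-image therefore does not give a uniform sample from the overall pre-image.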

Note that each time we apply iExt, the output length becomes half of the seed length. Thus
in the alternating extraction, if we start with seed length d, then after one sub-step of alternating
extraction the output length will become Ω(d), since the sub-step takes at most 2 rounds. We will
truncate the output if necessary to keep it the same length, no matter whether we choose R_{h,1} or
R_{h,2} (since they have different sizes). Now we need to use this output to extract \bar{Q}_h or Q_{h+1} from
Y. Since the size of \bar{Q}_h or Q_{h+1} is Ω(td), we will take O(t) new blocks from Y and apply iExt to
them using the same seed, and then concatenate the outputs. Since the blocks of Y form a block
source, and iExt is a strong seeded extractor, one can show that the concatenated output is close
to uniform. We then do the same thing for the next sub-step of alternating extraction. Since we
need to repeat alternating extraction for O(ℓ) steps, we need to divide Y into O(tℓ) blocks, while
we can divide X into only O(ℓ) blocks.

Idea 4: Now given any output, our sampling strategy is as follows. We first uniformly generate
X_1 and Y_1, from which we can compute V = IP(X_1, Y_1). Then we know which bits of the codeword
we are sampling. We then uniformly generate these sampled bits X_2, Y_2 and thus we obtain Z.
Once we have Z, we uniformly generate all {S_{h,j}, R_{h,j}, \bar{S}_{h,j}, \bar{R}_{h,j}} produced in alternating
extractions. Based on Z and these variables, we can now generate all the blocks of X used and all
the {Q_h, \bar{Q}_h} by inverting iExt. Finally, based on {Q_h, \bar{Q}_h} we can generate all the blocks of Y
used by again inverting iExt.

This almost works except for the following problem. The blocks of X and Y generated must
also satisfy the linear equations imposed by X_2, Y_2, which are the bits sampled from the codewords
obtained by encoding X and Y with a linear error correcting code. However, it is unclear how
the linear equations imposed by X_2, Y_2 depend on the other linear equations that we
obtained earlier. Of course, if they are linearly independent then we are in good shape.

To solve this problem, our crucial observation is that if ℓ is small and the number of blocks is
large enough (say we divide the rest of X into O(ℓ) blocks and the rest of Y into O(tℓ) blocks for a
large enough constant in O(·)), then the entire alternating extraction steps only consume, say, half of
the bits of X and Y. Thus, whatever linear equations we obtain from these steps only impose
constraints on, say, the first half of the bits of X and Y. Therefore, we can hope that the encodings of X
and Y use all the bits of X and Y, and thus the linear equations imposed by these encodings will
be linearly independent of the equations we obtain from alternating extraction (i.e., the second half
of the bits act as “free variables”).

We indeed succeed with this idea. More specifically, we are going to divide the rest of X and
Y (the parts excluding X_1 and Y_1, which has length n − ) into chunks of length b = ⌈log n⌉.
We will now view each chunk as an element in the field F_{2^b}. We then take, say, 0.9n bits and view
them as a string in F_{2^b}^{0.9n/b}, and use a Reed-Solomon code (RS-code for short) over F_{2^b} to encode
this string into a codeword in F_{2^b}^{2^b}. Note that 2^b ≥ n > 0.9n/b, so this encoding is feasible, and it
has distance rate (2^b − 0.9n/b)/2^b > 0.9. Now, instead of using V = IP(X_1, Y_1) to sample
bits, we will sample field elements from the encodings of X and Y, and then view them as bit
strings. Since the RS-code has distance rate 0.9, again we have that if two strings are different, then
with probability 1 − 2 , the sampled strings of their encodings will also be different. Moreover,
the sampled bit string now has length roughly log n, which is still small enough.

^1 Except when the seed is 0, but we can deal with this by adding a 1 to both the source and the seed.
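
As a quick numerical check of the Reed-Solomon parameters above, the following sketch computes b = ⌈log n⌉, the field size 2^b, the number of message symbols 0.9n/b, and the distance rate (2^b − 0.9n/b)/2^b. The helper name rs_parameters and its fraction argument are assumptions made for illustration only.

```python
def rs_parameters(n, fraction=0.9):
    """Return (b, field_size, message_symbols, distance_rate) for input length n.

    b is ceil(log2 n); the codeword is assumed to have one symbol per
    field element, i.e. length 2^b, as in the distance-rate formula in the text.
    """
    b = max(1, (n - 1).bit_length())       # ceil(log2 n) for n >= 2
    field_size = 2 ** b
    message_symbols = fraction * n / b      # 0.9 n / b
    distance_rate = (field_size - message_symbols) / field_size
    return b, field_size, message_symbols, distance_rate
```

For n = 1024 this gives b = 10 and a distance rate of 0.91. For very small n (for example n = 16, where the rate is 0.775) the bound of 0.9 is not met, so the claim should be read as holding once n is large enough that b is at least about 10.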

Now we can continue with our sampling strategy. As before, we first generate all the blocks of X
and Y used in all alternating extraction steps. This only involves the first half of the bits of X and Y.
Any fixing of these bits can be viewed equivalently as fixing the first 0.5n/b field elements in
a message. Thus we are still left with 0.4n/b free field elements, and the sampled symbols impose linear
equations over F_{2^b} according to the RS-code. As long as the number of free variables is larger than the
number of equations (i.e., 0.4n/b exceeds the number of sampled field elements), the property of the
RS-encoding ensures that this set of linear
equations is linearly independent. Thus, for any fixed first half of the bits of X and Y, the pre-image
according to the linear equations imposed by the sampled bits X_2, Y_2 has the same size.

Summary. Now we are basically done. Again, given any output, our sampling strategy is as
follows. We first uniformly generate X_1 and Y_1, from which we can compute V = IP(X_1, Y_1). Then
we know which co-ordinates of the codeword we are sampling. We then uniformly generate these
sampled bits X_2, Y_2 and thus we obtain Z. Once we have Z, we uniformly generate all
{S_{h,j}, R_{h,j}, \bar{S}_{h,j}, \bar{R}_{h,j}} produced in alternating extractions. Based on Z and these variables, we can
now generate all the blocks of X used and all the {Q_h, \bar{Q}_h} by inverting iExt. Based on {Q_h, \bar{Q}_h}
we can generate all the blocks of Y used by again inverting iExt. Finally, we use the linear equations
imposed by X_2, Y_2 to generate the rest of the bits in X and Y.

To show that we are indeed sampling uniformly from the output’s pre-image, we will establish 
the following two facts. 

Fact 1: For any fixed Z = z, any choice of {s_{h,j}, r_{h,j}, \bar{s}_{h,j}, \bar{r}_{h,j}} gives the same pre-image size of
(x, y). This follows directly from the fact that our linear seeded extractor has the same pre-image
size for any seed, and the argument about the linear equations imposed by the RS-code above.

Fact 2: For different Z = z, and different choices of {s_{h,j}, r_{h,j}, \bar{s}_{h,j}, \bar{r}_{h,j}}, the pre-image size of
(x, y) is also the same. This follows because the “flip-flop” alternating extraction is symmetric.
More specifically, whether each bit of z is 0 or 1, we will use two sub-steps of alternating
extraction, with each sub-step taking two rounds of alternating extraction. Thus by symmetry, whether
each bit of z is 0 or 1, the pre-image size of the blocks of X and {q_h, \bar{q}_h} is the same.
Moreover, although depending on the h'th bit of z we may choose either r_{h,1} or r_{h,2} (or either \bar{r}_{h,1}
or \bar{r}_{h,2}), we truncate them if necessary to the same size. So when we generate the blocks of Y
using them and {q_h, \bar{q}_h}, the pre-image of the blocks of Y will also have the same size. Thus, the
pre-image size of the blocks of X and Y used for this bit is the same. Therefore, for different Z = z
and different choices of {s_{h,j}, r_{h,j}, \bar{s}_{h,j}, \bar{r}_{h,j}}, the pre-image size is also the same.

Now the conclusion that we are sampling uniformly from the output’s pre-image follows from
the above two facts, and the observation that any (x, y) in the output’s pre-image produces exactly
one sequence z, {s_{h,j}, r_{h,j}, \bar{s}_{h,j}, \bar{r}_{h,j}}.


3 Preliminaries 

3.1 Notations 

We use capital letters to denote distributions and their support, and corresponding small letters to
denote a sample from the source. Let [m] denote the set {1, 2, ..., m}, and U_r denote the uniform
distribution over {0,1}^r. For a string x of length m, define the string Slice(x, w) to be the prefix
of length w of x. For any i ∈ [m], let x_{\{i\}} denote the symbol in the i’th co-ordinate of x, and for
any T ⊆ [m], let x_{\{T\}} denote the projection of x to the co-ordinates indexed by T.

3.2 Min Entropy, Flat Distributions

The min-entropy of a source X is defined to be H_∞(X) = min_{s ∈ support(X)} log(1/Pr[X = s]). A
distribution (source) D is flat if it is uniform over a set S. An (n, k)-source is a distribution on
{0,1}^n with min-entropy k. It is a well-known fact that any (n, k)-source is a convex combination
of flat sources supported on sets of size 2^k.
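
The definition can be checked on small distributions with a few lines of Python. This is only an illustrative sketch: the representation of a distribution as a dictionary from outcomes to probabilities, and the function names, are assumptions made here.

```python
import math

def min_entropy(dist):
    """H_inf(X) = min over s of log2(1 / Pr[X = s]) = -log2(max probability)."""
    return -math.log2(max(dist.values()))

def is_flat(dist, tol=1e-12):
    """A distribution is flat if it is uniform over its support."""
    probs = [p for p in dist.values() if p > 0]
    return all(abs(p - probs[0]) <= tol for p in probs)
```

A flat distribution on a set of size 2^k has min-entropy exactly k, while a non-flat distribution is penalized by its single most likely outcome.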

3.3 Statistical Distance, Convex Combination of Distributions and Probability 
Lemmas 

Definition 3.1 (Statistical distance). Let D_1 and D_2 be two distributions on a set S. The statistical
distance between D_1 and D_2 is defined to be:

    |D_1 - D_2| = \max_{T \subseteq S} |D_1(T) - D_2(T)| = \frac{1}{2} \sum_{s \in S} |\Pr[D_1 = s] - \Pr[D_2 = s]|.

D_1 is ε-close to D_2 if |D_1 − D_2| ≤ ε.
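
The two expressions for statistical distance (the maximum over events and half the L1 distance) can be compared directly on a small example. The sketch below is a brute-force illustration with hypothetical function names. The event-based version enumerates all subsets and is only meant for tiny supports.

```python
from itertools import combinations

def sd_half_l1(p, q):
    """Half the L1 distance between two distributions given as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support)

def sd_max_event(p, q):
    """Maximum over all events T of |p(T) - q(T)| (exponential time)."""
    support = sorted(set(p) | set(q))
    best = 0.0
    for r in range(len(support) + 1):
        for event in combinations(support, r):
            p_mass = sum(p.get(s, 0.0) for s in event)
            q_mass = sum(q.get(s, 0.0) for s in event)
            best = max(best, abs(p_mass - q_mass))
    return best
```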

Definition 3.2 (Convex combination). A distribution D on a set S is a convex combination of
distributions D_1, ..., D_ℓ on S if there exist non-negative constants (called weights) w_1, ..., w_ℓ with
Σ_{i=1}^{ℓ} w_i = 1 such that Pr[D = s] = Σ_{i=1}^{ℓ} w_i · Pr[D_i = s] for all s ∈ S. We use the notation
D = Σ_{i=1}^{ℓ} w_i · D_i to denote the fact that D is a convex combination of the distributions D_1, ..., D_ℓ
with weights w_1, ..., w_ℓ.

Definition 3.3. For random variables X and Y, we use X|Y to denote a random variable with
distribution: Pr[(X|Y) = x] = Σ_{y ∈ support(Y)} Pr[Y = y] · Pr[X = x | Y = y].

We record the following lemma which follows from the above definitions. 

Lemma 3.4. Let X and Y be distributions on a set S such that X = Σ_{i=1}^{ℓ} w_i · X_i and Y =
Σ_{i=1}^{ℓ} w_i · Y_i. Then |X − Y| ≤ Σ_i w_i · |X_i − Y_i|.

3.4 Seeded and Seedless Extractors 

Definition 3.5 (Strong seeded extractor). A function Ext : {0,1}^n × {0,1}^d → {0,1}^m is called a
strong seeded extractor for min-entropy k and error ε if for any (n, k)-source X and an independent
uniformly random string U_d, we have

    |\mathrm{Ext}(X, U_d) \circ U_d - U_m \circ U_d| \le \varepsilon,

where U_m is independent of U_d. Further, if the function Ext(·, u) is a linear function over F_2 for
every u ∈ {0,1}^d, then Ext is called a linear seeded extractor.

Definition 3.6 (Independent Source Extractor). A function IExt : ({0,1}^n)^t → {0,1}^m is an
extractor for independent (n, k) sources that uses t sources and outputs m bits with error ε, if for
any t independent (n, k) sources X_1, X_2, ..., X_t, we have

    |\mathrm{IExt}(X_1, X_2, \dots, X_t) - U_m| \le \varepsilon.

In the special case where t = 2, we say IExt is a two-source extractor.

3.5 Conditional Min-Entropy

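Definition 3.7. The average conditional min-entropy is defined as

    \tilde{H}_\infty(X|W) = -\log\left( \mathbb{E}_{w \leftarrow W}\left[ \max_x \Pr[X = x \mid W = w] \right] \right) = -\log\left( \mathbb{E}_{w \leftarrow W}\left[ 2^{-H_\infty(X|W=w)} \right] \right).

We recall some results on conditional min-entropy from [DORS08].

Lemma 3.8 ([DORS08]). For any s > 0,

    \Pr_{w \leftarrow W}\left[ H_\infty(X|W = w) \ge \tilde{H}_\infty(X|W) - s \right] \ge 1 - 2^{-s}.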


Lemma 3.9 ([DORS08]). If a random variable B can take at most 2^ℓ values, then \tilde{H}_∞(A|B) ≥
H_∞(A) − ℓ.
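
The average conditional min-entropy and the loss bound from [DORS08] (conditioning on a variable with at most 2^ℓ values costs at most ℓ bits) can be illustrated on a toy joint distribution. The sketch below assumes a joint distribution given as a dictionary from pairs (a, b) to probabilities. The function names are hypothetical.

```python
import math
from collections import defaultdict

def min_entropy(dist):
    """H_inf of a distribution given as {outcome: probability}."""
    return -math.log2(max(dist.values()))

def avg_cond_min_entropy(joint):
    """Average conditional min-entropy of A given B.

    joint maps (a, b) to Pr[A = a, B = b]. Since
    E_b[max_a Pr[A = a | B = b]] = sum_b max_a Pr[A = a, B = b],
    the value is -log2 of that sum.
    """
    best = defaultdict(float)
    for (a, b), p in joint.items():
        best[b] = max(best[b], p)
    return -math.log2(sum(best.values()))

def marginal_a(joint):
    """Marginal distribution of A."""
    out = defaultdict(float)
    for (a, _b), p in joint.items():
        out[a] += p
    return dict(out)
```

For example, if A is uniform on two bits and B is the first bit of A, then H_∞(A) = 2, B takes 2 = 2^1 values, and the average conditional min-entropy is exactly 1, so the bound is tight in this case.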


It is sometimes convenient to work with average case seeded extractors, where if a source X has
average case conditional min-entropy \tilde{H}_∞(X|Z) ≥ k then the output of the extractor is uniform
even when Z is given.

Lemma 3.10 ([DORS08]). For any δ > 0, if Ext is a (k, ε)-extractor then it is also a (k + log(1/δ),
ε + δ) average case extractor.


The following result on conditional min-entropy was proved in [MW97].

Lemma 3.11. Let X, Y be random variables such that the random variable Y takes at most ℓ values.
Then

    \Pr_{y \leftarrow Y}\left[ H_\infty(X|Y = y) \ge H_\infty(X) - \log \ell - \log(1/\varepsilon) \right] \ge 1 - \varepsilon.


We also need the following lemma from [Li12b].

Lemma 3.12. Let X, Y be random variables with supports S, T ⊆ V such that (X, Y) is ε-close to
a distribution with min-entropy k. Further suppose that the random variable Y can take at most ℓ
values. Then

    \Pr_{y \leftarrow Y}\left[ (X|Y = y) \text{ is } \varepsilon^{1/2}\text{-close to a source with min-entropy } k - \log \ell - \log(1/\varepsilon) \right] \ge 1 - 2\varepsilon^{1/2}.


3.6 Somewhere Random Sources 

Definition 3.13. A source X is a t × k somewhere random source if it comprises t rows on
{0,1}^k such that at least one of the rows is uniformly distributed. The rows may have arbitrary
correlations among themselves.

3.7 Some Known Extractor Constructions 

We use explicit constructions of strong linear seeded extractors [Tre01, RRV02].

Theorem 3.14 ([Tre01, RRV02]). For every n, k, m ∈ N and ε > 0 such that m ≤ k ≤ n, there
exists an explicit linear strong seeded extractor LSExt : {0,1}^n × {0,1}^d → {0,1}^m for min-entropy
k, error ε, and d = O(log^2(n/ε) / log(k/m)).

The following is an explicit construction of a strong seeded extractor with optimal parameters
[GUV09].

Theorem 3.15. For any constant α > 0, and all integers n, k > 0 there exists a polynomial time
computable strong seeded extractor Ext : {0,1}^n × {0,1}^d → {0,1}^m with d = O(log n + log(1/ε)) and
m = (1 − α)k.

We use the following strong seeded extractor constructed by Zuckerman [Zuc07] that achieves
seed length log(n) + O(log(1/ε)) to extract from any source with constant min-entropy rate.

Theorem 3.16 ([Zuc07]). For all constants α, δ, ε > 0 and for all n > 0 there exists an efficient
construction of a strong seeded extractor Ext : {0,1}^n × {0,1}^d → {0,1}^m with m ≥ (1 − α)n and
D = 2^d = O(n).

We recall a folklore construction of a two-source extractor based on the inner product function
[CG88]. We include a proof for completeness.

Theorem 3.17 ([CG88]). For all m, r > 0, with q = 2^m, n = rm, let X, Y be independent sources
on F_q^r with min-entropy k_1, k_2 respectively. Let IP be the inner product function over the field F_q.
Then, we have:

    |\mathrm{IP}(X, Y), X - U_m, X| \le \varepsilon, \qquad |\mathrm{IP}(X, Y), Y - U_m, Y| \le \varepsilon,

where ε = 2^{-(k_1 + k_2 - n - m)/2}.


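Proof. Let X, Y be uniform on sets A, B ⊆ F_q^r respectively, with |A| = 2^{k_1} and |B| = 2^{k_2}. Let ψ be
any non-trivial additive character of the finite field F_q. For short, we use · to denote the standard
inner product over F_q^r. We have

    \sum_{y \in B} \Big| \sum_{x \in A} \psi(x \cdot y) \Big|
        \le |B|^{1/2} \Big( \sum_{y \in B} \Big| \sum_{x \in A} \psi(x \cdot y) \Big|^2 \Big)^{1/2}
        = |B|^{1/2} \Big( \sum_{y \in B} \sum_{x, x' \in A} \psi((x - x') \cdot y) \Big)^{1/2}
        \le |B|^{1/2} \Big( \sum_{y \in F_q^r} \sum_{x, x' \in A} \psi((x - x') \cdot y) \Big)^{1/2},

where the first inequality follows by an application of the Cauchy-Schwarz inequality. Further,
whenever x ≠ x', we have

    \sum_{y \in F_q^r} \psi((x - x') \cdot y) = 0.

Thus, continuing with our estimate, we have

    \sum_{y \in B} \Big| \sum_{x \in A} \psi(x \cdot y) \Big| \le \big( |A| \, |B| \, q^r \big)^{1/2} = 2^{(n + k_1 + k_2)/2}.

Thus, dividing by |A| · |B| = 2^{k_1 + k_2},

    \mathbb{E}_{y \sim Y} \big| \mathbb{E}_{x \sim X} \, \psi(\mathrm{IP}(x, y)) \big| \le 2^{(n - k_1 - k_2)/2}.

Using Vazirani’s XOR Lemma (see [Rao07] for a proof), it now follows that

    |\mathrm{IP}(X, Y), Y - U_m, Y| \le 2^{(n + m - k_1 - k_2)/2}.

It can be similarly shown that |IP(X, Y), X − U_m, X| ≤ 2^{(n + m - k_1 - k_2)/2}. □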
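
The bound of Theorem 3.17 can be sanity-checked by brute force in the smallest case m = 1 (so q = 2 and n = r), where IP is the usual inner product of bit vectors modulo 2. The sketch below is only an illustration under that assumption. The function names are hypothetical, and the sources are flat distributions on explicitly listed sets.

```python
import random
from itertools import product

def ip2(x, y):
    """Inner product of two bit tuples over GF(2)."""
    return sum(a & b for a, b in zip(x, y)) % 2

def strong_distance(A, B):
    """Statistical distance between (IP(X, Y), Y) and (U_1, Y).

    X and Y are uniform on the lists A and B. For each fixed y the
    distance of a bit with Pr[1] = p from a uniform bit is |p - 1/2|.
    """
    total = 0.0
    for y in B:
        p_one = sum(ip2(x, y) for x in A) / len(A)
        total += abs(p_one - 0.5)
    return total / len(B)

def theorem_bound(n, k1, k2, m=1):
    """The error bound 2^{-(k1 + k2 - n - m)/2} from Theorem 3.17."""
    return 2.0 ** (-(k1 + k2 - n - m) / 2)
```

With n = 6 and two random sets of size 32 (k_1 = k_2 = 5), the measured distance stays below the bound 2^{-1.5}. In contrast, taking X supported on vectors whose first three coordinates are zero and Y on vectors whose last three coordinates are zero (k_1 = k_2 = 3) makes the inner product constant, and the bound is then larger than 1 and says nothing, which shows why k_1 + k_2 must exceed n + m.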

4 Seeded and Seedless Non-Malleable Extractors


We formally introduce seedless (2, t)-non-malleable extractors in this section. We first recall
the definition of seeded t-non-malleable extractors from [CRS14], which generalizes the definition
introduced in [DW09].

Definition 4.1 (t-Non-malleable Extractor). A function snmExt : {0,1}^n × {0,1}^d → {0,1}^m is
a seeded t-non-malleable extractor for min-entropy k and error ε if the following holds: If X is
a source on {0,1}^n with min-entropy k and A_1 : {0,1}^d → {0,1}^d, ..., A_t : {0,1}^d → {0,1}^d are
arbitrary tampering functions with no fixed points, then

    |\mathrm{snmExt}(X, U_d) \circ \mathrm{snmExt}(X, A_1(U_d)) \circ \dots \circ \mathrm{snmExt}(X, A_t(U_d)) \circ U_d
        - U_m \circ \mathrm{snmExt}(X, A_1(U_d)) \circ \dots \circ \mathrm{snmExt}(X, A_t(U_d)) \circ U_d| \le \varepsilon,

where U_m is independent of U_d and X.

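We now proceed to define seedless non-malleable extractors, which were introduced by
Cheraghchi and Guruswami in [CG14b].

We need the following functions.

    \mathrm{copy}(x, y) = \begin{cases} x & \text{if } x \ne \mathit{same}^{\star} \\ y & \text{if } x = \mathit{same}^{\star} \end{cases}

    \mathrm{copy}^{(t)}((x_1, \dots, x_t), (y_1, \dots, y_t)) = (\mathrm{copy}(x_1, y_1), \dots, \mathrm{copy}(x_t, y_t))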
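
The two functions above are simple enough to state in code. The following Python sketch uses the string "same*" as an assumed encoding of the special symbol. The names SAME and copy_t are illustrative choices, not notation from this paper.

```python
SAME = "same*"  # assumed encoding of the special symbol same*

def copy(x, y):
    """Return y when x is the special symbol same*, and x otherwise."""
    return y if x == SAME else x

def copy_t(xs, ys):
    """Coordinate-wise version of copy on two t-tuples."""
    if len(xs) != len(ys):
        raise ValueError("both tuples must have the same length t")
    return tuple(copy(x, y) for x, y in zip(xs, ys))
```

In the definitions that follow, the first argument plays the role of the simulator's output: same* means that the tampered output equals the untampered one, and any other value means it is replaced by an independent string.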

Definition 4.2 (Seedless Non-Malleable Extractor). A function nmExt : {0,1}^n → {0,1}^m is a
seedless non-malleable extractor with respect to a class of sources \mathcal{X} and a family of tampering
functions \mathcal{F} with error ε if for every distribution X ∈ \mathcal{X} and every tampering function f ∈ \mathcal{F}, there
exists a random variable D_{X,f} on {0,1}^m ∪ {same*} which is independent of the source X such
that

    |\mathrm{nmExt}(X) \circ \mathrm{nmExt}(f(X)) - U_m \circ \mathrm{copy}(D_{X,f}, U_m)| \le \varepsilon,

where both U_m’s refer to the same uniform m-bit string.


When the class of tampering functions are 2-split-state, the definition of seedless non-malleable 
extractors specializes as follows. 

Definition 4.3 (Seedless 2-Non-Malleable Extractor). A function nmExt : {0,1}^n × {0,1}^n →
{0,1}^m is a seedless 2-non-malleable extractor at min-entropy k and error ε if it satisfies the
following property: If X and Y are independent (n, k)-sources and A = (f, g) is an arbitrary 2-
split-state tampering function, then there exists a random variable D_{f,g} on {0,1}^m ∪ {same*} which
is independent of the sources X and Y, such that

    |\mathrm{nmExt}(X, Y) \circ \mathrm{nmExt}(A(X, Y)) - U_m \circ \mathrm{copy}(D_{f,g}, U_m)| \le \varepsilon,

where both U_m’s refer to the same uniform m-bit string.


In this work, we introduce the following natural generalization where the sources X, Y are 
tampered by t tampering functions, each of which is from the 2-split-state family. 

Definition 4.4 (Seedless (2,t)-Non-Malleable Extractor). A function nmExt : {0,1}^n × {0,1}^n →
{0,1}^m is a seedless (2,t)-non-malleable extractor at min-entropy k and error ε if it satisfies the
following property: If X and Y are independent (n, k)-sources and A_1 = (f_1, g_1), ..., A_t = (f_t,
g_t) are t arbitrary 2-split-state tampering functions, then there exists a random variable D_{\vec{f},\vec{g}} on
({0,1}^m ∪ {same*})^t which is independent of the sources X and Y, such that

    |\mathrm{nmExt}(X, Y), \mathrm{nmExt}(A_1(X, Y)), \dots, \mathrm{nmExt}(A_t(X, Y)) - U_m, \mathrm{copy}^{(t)}(D_{\vec{f},\vec{g}}, U_m)| \le \varepsilon,

where both U_m’s refer to the same uniform m-bit string.

5 Non-malleable codes via Seedless non-malleable extractors 

The following theorem is a straightforward generalization of the connection found between
non-malleable codes and seedless non-malleable extractors [CG14b].

Theorem 5.1. Let nmExt : ({0,1}^n)^2 → {0,1}^m be a polynomial time computable seedless (2,t)-
non-malleable extractor for min-entropy n with error ε. Then there exists a one-many non-malleable
code with an efficient decoder in the 2-split-state model with tampering degree t, block length = 2n,
relative rate m/(2n) and error =

The one-many non-malleable codes in the 2-split-state model are defined in the following way:
For any message s ∈ {0,1}^m, the encoder Enc(s) outputs a uniformly random string from the set
nmExt^{-1}(s) ⊆ {0,1}^{2n}. For any codeword c ∈ {0,1}^{2n}, the decoder Dec outputs nmExt(c). Thus
for the encoder to be efficient, one needs to sample almost uniformly from nmExt^{-1}(s).
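
The encoder and decoder just described can be sketched in a few lines once some function is plugged in for nmExt. The example below uses the one-bit inner product over F_2 purely as a placeholder. This toy function is NOT non-malleable, and the brute-force pre-image table is only feasible for tiny n. All names are hypothetical.

```python
import random
from itertools import product

def toy_ext(x, y):
    """Placeholder for nmExt: one-bit inner product over GF(2) (not non-malleable)."""
    return sum(a & b for a, b in zip(x, y)) % 2

def build_preimages(n):
    """Tabulate toy_ext^{-1}(s) for s in {0, 1} by enumerating all (x, y)."""
    table = {0: [], 1: []}
    for x in product((0, 1), repeat=n):
        for y in product((0, 1), repeat=n):
            table[toy_ext(x, y)].append((x, y))
    return table

def encode(s, table, rng):
    """Enc(s): a uniformly random element of the pre-image of s."""
    return rng.choice(table[s])

def decode(codeword):
    """Dec(c): apply the extractor to the two halves of the codeword."""
    x, y = codeword
    return toy_ext(x, y)
```

Decoding always recovers the message by construction. The difficulty addressed in Section 2.3 is replacing the exponential-size table with an efficient procedure that samples (almost) uniformly from the pre-image.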


6 An Explicit Seedless (2, t)-Non-Malleable Extractor 

We first set up some tools that we use in our extractor construction. 


6.1 Averaging Samplers 


In our construction, we need to pseudorandomly sample a subset T in [n] such that it intersects
any large enough subset with high probability. It turns out that a stronger sampling problem has
been extensively studied, with the following stronger requirement: for any function f : [n] → [0,1],
the average of f on the sampled subset T is close to its actual mean with high probability. Such
sampling procedures are known as averaging samplers. We use the definition from [Vad04].


Definition 6.1 (Averaging sampler [Vad04]). A function Samp : {0,1}^r → [n]^t is a (μ, θ, γ)
averaging sampler if for every function f : [n] → [0,1] with average value (1/n) Σ_i f(i) ≥ μ, it holds
that

    \Pr_{(i_1, \dots, i_t) \leftarrow \mathrm{Samp}(U_r)}\left[ \frac{1}{t} \sum_{j=1}^{t} f(i_j) < \mu - \theta \right] \le \gamma.

Samp has distinct samples if for every x ∈ {0,1}^r, the samples produced by Samp(x) are all distinct.


The following theorem proved by Zuckerman [Zuc97] essentially shows that seeded extractors
are equivalent to averaging samplers.

Theorem 6.2 ([Zuc97]). Let Ext : {0,1}^n × {0,1}^d → {0,1}^m be a strong seeded extractor for min-
entropy k and error ε. Let {0,1}^d = {s_1, ..., s_{2^d}}. Then Samp(x) = (Ext(x, s_1) ∘ s_1, ..., Ext(x, s_{2^d}) ∘ s_{2^d})
is a (μ, θ, γ) averaging sampler with distinct samples for any μ > 0, θ = ε and γ =

Using known constructions of strong seeded extractors, we have the following corollary. 

Corollary 6.3. For any constants (^Samp, i^Samp > 0, there exist constants a,j3 < t'samp such that 
for all n > 0 and any r > there exists a polynomial time computable function Samp : {0, 

1}’' ^ [j 7 ,]*samp = 0{n^) satisfying the following property: for any set S C [n] of size fe am pn, 

Pr[|Samp(C/p) n 5| > 1] > 1 - 

Further Samp has distinct samples. 

Proof. We set the parameter a as follows. Let Ext : {0,1}”°' x {0, l}'^ —> {0,1}”^ be the strong linear 
seeded extractor for min-entropy k = ^ and error e = | from Theorem l3.14l Thus t = 2^ = 0(n“'^) 
for some constant c. We choose a < t'samp small enough such that ca < Vsamp (and set (3 = ca). 

The result now follows by using Theorem 6.2. □

6.2 Alternating Extraction 

We recall the method of alternating extraction, which we use as a crucial component in our
construction.

The alternating extraction protocol takes in two integer parameters u, m > 0. Assume that
there are two parties, Quentin with a source Q and a uniform seed S_1 (which may be correlated
with Q), and Wendy with a source W. Further suppose that (Q, S_1) is kept secret from Wendy
and W is kept secret from Quentin. The protocol is an interactive process between Quentin and
Wendy, and runs for u steps.

Let Ext_q, Ext_w be strong seeded extractors. In the first step, Quentin sends S_1 to Wendy,
Wendy computes R_1 = Ext_w(W, S_1) and sends it back to Quentin, and Quentin then computes
S_2 = Ext_q(Q, R_1). Continuing in this way, in step i, Quentin sends S_i, Wendy computes the
random variable R_i = Ext_w(W, S_i) and sends it to Quentin, and Quentin then computes the
random variable S_{i+1} = Ext_q(Q, R_i). This is done for u steps. Each of the random variables R_i, S_i
is of length m. Thus, the following sequence of random variables is generated:

    S_1, R_1 = Ext_w(W, S_1), S_2 = Ext_q(Q, R_1), ..., S_u = Ext_q(Q, R_{u-1}), R_u = Ext_w(W, S_u).

Look-Ahead Extractor. We define the following look-ahead extractor:

    laExt(W, (Q, S_1)) = R_1, ..., R_u
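
The message flow of the alternating extraction protocol and the resulting look-ahead extractor can be sketched as a short loop. In the sketch below, toy_ext is only a placeholder built from SHA-256 and is an assumption made for illustration. It is not a strong seeded extractor in the information-theoretic sense used in this paper, and m is measured in bytes rather than bits.

```python
import hashlib

def toy_ext(source: bytes, seed: bytes, m: int) -> bytes:
    """Placeholder for a strong seeded extractor: m bytes of SHA-256(seed | source)."""
    if not 0 < m <= 32:
        raise ValueError("m must be between 1 and 32 bytes")
    return hashlib.sha256(seed + b"|" + source).digest()[:m]

def la_ext(w: bytes, q: bytes, s1: bytes, u: int, m: int):
    """Look-ahead extractor: run u steps of alternating extraction, return R_1..R_u."""
    rs = []
    s = s1
    for i in range(u):
        r = toy_ext(w, s, m)       # Wendy: R_i = Ext_w(W, S_i)
        rs.append(r)
        if i < u - 1:
            s = toy_ext(q, r, m)   # Quentin: S_{i+1} = Ext_q(Q, R_i)
    return rs
```

Each R_i depends on Wendy's source only through the current seed S_i, and each S_{i+1} depends on Quentin's source only through R_i, which is the alternation that the analysis in Lemma 6.5 exploits.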

In our application of the alternating extraction protocol, the initial seed S_1 is not guaranteed
to be uniform but only has high min-entropy.^2 We first prove a lemma which shows that strong
seeded extractors work even when the seed is not uniform but has high enough min-entropy.

Lemma 6.4. Let Ext : {0,1}^n × {0,1}^d → {0,1}^m be a strong seeded extractor for min-entropy
k and error ε. Let X be an (n, k)-source and let Y be a source on {0,1}^d with min-entropy d − λ.
Then,

    |\mathrm{Ext}(X, Y) \circ Y - U_m \circ Y| \le 2^{\lambda} \varepsilon.

^2 Another way to handle this is to use the extractor from [Raz05], but we avoid this to ensure invertibility of the
final extractor.

Proof. Since Y is a source with min-entropy d − λ, we can assume it is uniform on a set A of size
2^{d-λ}. Thus

    |\mathrm{Ext}(X, Y) \circ Y - U_m \circ Y| = \frac{1}{2^{d-\lambda}} \sum_{y \in A} |\mathrm{Ext}(X, y) - U_m|
        \le \frac{1}{2^{d-\lambda}} \sum_{y \in \{0,1\}^d} |\mathrm{Ext}(X, y) - U_m|
        \le \frac{1}{2^{d-\lambda}} \cdot 2^{d} \varepsilon = 2^{\lambda} \varepsilon,

where the last inequality uses the fact that Ext is a strong seeded extractor. □


Notation: If Z_a, Z_{a+1}, ..., Z_b are random variables, we use Z_{[a,b]} to denote the random variable
(Z_a, ..., Z_b).

We now prove a general lemma which establishes a strong property satisfied by the alternating
extraction protocol. The proof uses ideas from a result proved by Li on alternating extraction
[Li13a], and in fact generalizes this result.

Lemma 6.5. Let X be an (n_w, k_w)-source and let X^{(1)}, ..., X^{(t)} be random variables on {0,1}^{n_w}
that are arbitrarily correlated with X. Let Y = (Q, S_1), Y^{(1)} = (Q^{(1)}, S_1^{(1)}), ..., Y^{(t)} = (Q^{(t)}, S_1^{(t)}) be
arbitrarily correlated random variables that are independent of (X, X^{(1)}, X^{(2)}, ..., X^{(t)}). Suppose
that Q is an (n_q, k_q)-source, S_1 is an (m, m − λ)-source, Q^{(1)}, ..., Q^{(t)} are each on n_q bits, and
S_1^{(1)}, ..., S_1^{(t)} are each on m bits. Let Ext_q, Ext_w be strong seeded extractors that extract m bits at min-
entropy k with error ε and seed length m. Let laExt be the look-ahead extractor for an alternating
extraction protocol with parameters u, m, with Ext_q, Ext_w being the strong seeded extractors used
by Quentin and Wendy respectively. Let laExt(X, Y) = R_1, ..., R_u and for j ∈ [t], laExt(X^{(j)},
Y^{(j)}) = R_1^{(j)}, ..., R_u^{(j)}. If k_w, k_q ≥ k + u(t + 1)m + 2 log(1/ε), then the following holds for each
i ∈ [u]:

Ri, ,..., , Q, ,..., qw u^, ,..., , q, ,..., qw 

where e* = 0{ue + 2'^e). 


Proof. We in fact prove the following stronger claim. 
Claim 6.6. For each i G [tt] the following hold: 



nd) q c(l) 

• • ’ -^[1,1-1]’ *^[14]’ *^[1,1]’ 

....s'L-o-o''’’ 

...,qW 

Ei Um, R[l^i—l], R^i i_i^i • 

r>d) q 0(1) 

• • ’ -^[1,1-1]’ 


...,qW 


qd) Tj pdi 

■ • ’ *^[1,1]’ -^[1.*]’ • • • 

,r[Lwww,.. 



qd) Tj pdi 

• • ’ *^[1,1]’ -^[1.*]’ • • • 

,r[LwwW,.. 



where = 4(z — l)e+2''‘e. Further, conditioned on ■ ■ ■, 'S’p Ij) ■ ■ ■ > 

(a) (X,,...,yW) is independent from {Y,Y^^\ ... ,Y^^'i), (b) X,Q each have average condi¬ 
tional min-entropy at least {u — i)(t + l)m + A: + 2 log (|) and (c) Ri, R^^\ ..., Rf^ are deterministic 
functions of {X, ,..., X^^'l). 
Proof. We prove this claim by induction on i.

Let i = 1. Since R_1 = Ext_w(X, S_1), and Ext_w is a strong seeded extractor, it follows by Lemma
6.4 that Ext_w(X, S_1), S_1 ≈_{ε_1} U_m, S_1, where ε_1 = 2^λ ε. Thus we can fix S_1, and R_1 is still ε_1-close to
uniform on average. We note that R_1 is a deterministic function of X. Since the random variables
S_1^{(1)}, ..., S_1^{(t)}, Q, Q^{(1)}, ..., Q^{(t)} are deterministic functions of Y, Y^{(1)}, ..., Y^{(t)} and thus uncorrelated
with X, we have




(i) 




We fix the random variables Si, ..., sf \ By Lemma (3.91 the source Q has average conditional 
min-entropy at least kq — m{t -|- 1) = k + {u — l)m{t -|- 1) -|- 2 log after this fixing. Using 
Lemma [3.101 it follows that Extg is a (A: -|- log (i) , 2e) strong average case extractor. We also note 

that Ri, R^\ ..., R^^ are now deterministic functions of X, X^^\ ... , X^^\ Thus recalling that 
S 2 = Extq(g,i?i), we have S 2 ,Ri ~( 2 e+ei) Um,Ri, since Ri is ei-close to uniform and using the 
fact that by Lemma [3T0] Ext^ is a (A; -|- log (i) , 2e) strong average case extractor. Thus on fixing 
Ri, S 2 is (2e -|- ei)-close to Um on average and is a deterministic function of Y. Since the random 
variables , • • •, Ri^ are deterministic functions of X, X^^\ ..., X^^\ we thus have 


c c c(^) c(^) p p(^) 

^.,,+2eUm,Si,S[^\sP,Rl,R^^\ 




Further, it still holds that {X, X^^\ ..., is independent from {Y,Y^^\ ... ,Y^^')). This 

proves the base case of our induction. 


Now suppose that the claim is true for i and we will prove it for i -|- 1. Fix the random variables 
5 -^[1 i_i] 5 • • •) 5 5'p j], , ■ ■ ■, S^ii] • By induction hypothesis, it follows that X, Q each 

have average conditional min-entropy at least {u — i)m{t -|- 1) -|- A: -|- 2 log (^) after this fixing. We 
now fix the random variables Ri, R^l\ ■ ■ ■, Rf^ (these random variables are deterministic functions 
of X,X'^^\... by induction hypothesis). Thus by Lemma 13.91 the source X has conditional 
min-entropy at least {u — i){t + l)m + k + 2 log (i) — (A + l)m = (u — i — l)(t -|- l)m -|- A: -|- 2 log (i) 
after this fixing. 


Since Si+i = Extg{Q, Ri) is now independent of X and (e* -|- 2e)-close to Um on average (by 
induction hypothesis), it follows that Extu;(X, S'i+i), 5j+i ~ei+ 4 e Um,Si+i. Thus on fixing 5j+i, 
Ri+i = Ext^(X, 5i+i) is (cj -|- 4e)-close to Um on average, and is a deterministic function of X. 
We also fix the random variables Since we have fixed the random variables rI^\ 

...,Rf\ thus 5,..., are deterministic functions of Y,Y^^\ ... ,Y^*'\ Hence Ri+i is still 
Cj+i-close to uniform on average and a deterministic function of X after this fixing. Thus, 


Ri+i, ! ■ ■ ■ 5 5 ‘^[14+1]5 *^[1,1+1]> ■ ■ ■) \ ) g*' ^ 

^ TI R, , rB) d(*) c c( 1) clb f) r){l) fi{t) 


The source Q has conditional min-entropy at least {u — i){t + l)m + k + 2 log (i) — (A + ^)m = 
{u — i — 1)(A -|- l)m + k + 2log (i). 

Recall that 5j+2 = Extq(g, Ri+i). Since Ext^ is a (A;-|-log (i) , 2e) strong average case extractor, 
it follows that Extq(g, Rj+i), iij+i ^ei+2+2e Um- Since the random variables R^^i, ■ ■ ■, R-fJi are 
deterministic functions of X,X^^\ ... (recall that we have fixed ..., it follows 

that 

Si+2, 'S'[l,i + 1] , 'S’fl j+l] ) ■ ■ ■ ) ! -^[1,1 + 1] ! + ! ■ ■ ■ ! + \ • • • ! ^ 

«,.«+2. (7„, S|i,i+,|, ..., s» ....X. xw...., x«. 

Also, we maintain at each step that (A, ..., is independent from (A, ..., A^*)). This 

completes the proof. □ 

□ 

Remark 6.7. We note that if instead of using a strong seeded extractor to generate $R_i$ (recall
$R_i = \mathrm{Ext}_w(X, S_i)$), we used the extractor constructed by Raz [Raz05], then the error achieved is
$O(u\epsilon)$.


6.3 Construction of Some Key Components 


In this section, we construct functions which are key ingredients in all our explicit extractor
constructions. They are based on a new way of using the technique of alternating extraction, and are
inspired by a recent elegant work of Cohen [Coh15] on constructing local correlation breakers.

We define the following function, which is inspired by the “flip-flop” method introduced by
Cohen [Coh15].

We now prove the following lemma. 


Lemma 6.8. Let $b, \{b^{(h)} : h \in [j]\}$ be $j + 1$ bits such that for all $h \in [j]$, $b \neq b^{(h)}$. Let X be a
$(n_w, k_w)$-source and let $\{X^{(h)} : h \in [j]\}$ be random variables on $\{0,1\}^{n_w}$ that are arbitrarily correlated
with X. Let $Y, \{Y^{(h)} : h \in [j]\}$ be arbitrarily correlated random variables that are independent of
$(X, \{X^{(h)} : h \in [j]\})$. Suppose that Y is a $(n_y, k_y)$-source, $k_y = n_y - \lambda$, and each random variable in
$\{Y^{(h)} : h \in [j]\}$ is on $n_y$ bits. Let $Q_i$ be some function of Y on $n_q$ bits with min-entropy at least
$n_q - \lambda$, and for each $h \in [j]$, let $Q_i^{(h)}$ be an arbitrary function of $Y, \{Y^{(a)} : a \in [j]\}$ on $n_q$ bits.

Let 2laExt be the function computed by Algorithm 1. Let $\mathrm{2laExt}(X, Y, Q_i, b) = Q_{i+1}$, and for $h \in
[j]$, let $\mathrm{2laExt}(X^{(h)}, Y^{(h)}, Q_i^{(h)}, b^{(h)}) = Q_{i+1}^{(h)}$. Suppose $k_y \geq \max\{k, k_1\} + 10(jn_q + jm + \log(1/\epsilon))$,
$k_w \geq k + 10(jm + \log(1/\epsilon))$, and $n_q \geq k + 10jm + 2\log(1/\epsilon) + \lambda$.


Then with probability at least 1—e', where e' = 0{2^e), over the fixing of the random variables Qi, 


$\{Q_i^{(h)} : h \in [j]\}$, $R_{i,1}, R_{i,2}$, $\{R_{i,1}^{(h)}, R_{i,2}^{(h)} : h \in [j]\}$, $\bar{Q}_i$, $\{\bar{Q}_i^{(h)} : h \in [j]\}$, $\bar{R}_{i,1}, \bar{R}_{i,2}$, $\{\bar{R}_{i,1}^{(h)}, \bar{R}_{i,2}^{(h)} : h \in [j]\}$,
$\{Q_{i+1}^{(h)} : h \in [j]\}$: (a) $Q_{i+1}$ is $\epsilon'$-close to $U_{n_q}$ and is a deterministic function of Y (b) The random
variables $(X, \{X^{(h)} : h \in [j]\})$ and $(Y, \{Y^{(h)} : h \in [j]\})$ are independent (c) X has min-entropy at
least $k_w - 10(jm + \log(1/\epsilon))$ and Y has min-entropy at least $k_y - 10(jn_q + jm + \log(1/\epsilon))$.


Proof. Notation: For any function H, if $V = H(X, Y)$, let $V^{(a)}$ denote the random variable
$H(X^{(a)}, Y^{(a)})$.

We split the proof into two cases, depending on b. 

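Case 1: Suppose b = 1. By Lemma [631 it follows that

$$R_{i,2}, \{R_{i,1}^{(h)} : h \in [j]\}, Q_i, \{Q_i^{(h)} : h \in [j]\} \approx_{\epsilon_1} U_m, \{R_{i,1}^{(h)} : h \in [j]\}, Q_i, \{Q_i^{(h)} : h \in [j]\}$$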
Algorithm 1: 2laExt(x, y, q_i, b)

Input: Bit strings $x, y, q_i$ of length $n_w, n_y, n_q$ respectively, and a bit b.

Output: A bit string of length $n_q$.

Subroutine: Let $\mathrm{Ext}_q : \{0,1\}^{n_q} \times \{0,1\}^{m} \to \{0,1\}^{m}$ be a strong seeded extractor set to
extract from min-entropy k with error $\epsilon$ and seed length m. Let $\mathrm{Ext}_w : \{0,1\}^{n_w} \times \{0,1\}^{m} \to \{0,1\}^{m}$
be a strong seeded extractor set to extract from min-entropy k with error $\epsilon$ and seed length m.

Let $\mathrm{laExt} : \{0,1\}^{n_w} \times \{0,1\}^{n_q + m} \to \{0,1\}^{2m}$ be the look ahead extractor defined in Section 6.2
for an alternating extraction protocol with parameters m, u = 2 (recall u is the number of
steps in the protocol, m is the length of each random variable that is communicated between
the players), and using $\mathrm{Ext}_q, \mathrm{Ext}_w$ as the strong seeded extractors.

Let $\mathrm{Ext} : \{0,1\}^{n_y} \times \{0,1\}^{m} \to \{0,1\}^{n_q}$ be a strong seeded extractor set to extract from
min-entropy $k_1$ with error $\epsilon$.

  Let $s_{i,1} = \mathrm{Slice}(q_i, m)$
  Let $\mathrm{laExt}(x, (q_i, s_{i,1})) = r_{i,1}, r_{i,2}$
  if b = 0, let $\bar{q}_i = \mathrm{Ext}(y, r_{i,1})$
  else let $\bar{q}_i = \mathrm{Ext}(y, r_{i,2})$
  endif
  Let $\bar{s}_{i,1} = \mathrm{Slice}(\bar{q}_i, m)$
  Let $\mathrm{laExt}(x, (\bar{q}_i, \bar{s}_{i,1})) = \bar{r}_{i,1}, \bar{r}_{i,2}$
  if b = 0, let $q_{i+1} = \mathrm{Ext}(y, \bar{r}_{i,2})$
  else let $q_{i+1} = \mathrm{Ext}(y, \bar{r}_{i,1})$
  endif
  Output $q_{i+1}$.


, where ei = c2^e, for some constant c. Thus, we can fix {R^^i : h G [j]},Qi, {Qi^^ ■ h G [j]}, and 
with probability at least $1 - O(\epsilon_1)$, $R_{i,2}$ is $O(\epsilon_1)$-close to $U_m$. Note that $R_{i,2}$ is now a deterministic
function of X. Further, by Lemma 3.11, Y loses min-entropy at most $(j + 1)n_q + \log(1/\epsilon)$ with
probability at least $1 - \epsilon$ due to this fixing. Since on fixing $\{Q_i^{(h)} : h \in [j]\}$, the random
variables $\{R_{i,1}^{(h)} : h \in [j]\}$ are deterministic functions of $X, \{X^{(h)} : h \in [j]\}$, the source X loses
min-entropy at most $jm + \log(1/\epsilon)$ with probability at least $1 - \epsilon$ due to this fixing. We now note
that the random variables $\{\bar{Q}_i^{(h)} : h \in [j]\}$ are deterministic functions of $Y, \{Y^{(h)} : h \in [j]\}$. Thus,
we fix $\{\bar{Q}_i^{(h)} : h \in [j]\}$, and by Lemma 3.11, Y loses min-entropy at most $jn_q + \log(1/\epsilon)$ with
probability at least $1 - \epsilon$ due to this fixing. Since Ext extracts from min-entropy $k_1$, and $k_y$ was
chosen large enough, it follows that the random variable $\bar{Q}_i$ is $(\epsilon + \epsilon_1)$-close to $U_{n_q}$ with probability
at least $1 - O(\epsilon_1)$ even after the fixing. Further, we fix $R_{i,2}$ since Ext is a strong seeded extractor,
and by Lemma 3.11, X loses min-entropy at most $m + \log(1/\epsilon)$ with probability at least $1 - \epsilon$ due
to this fixing. Thus $\bar{Q}_i$ is now a deterministic function of Y. We now fix the random variables
$\{R_{i,2}^{(h)} : h \in [j]\}$, noting that they are deterministic functions of X and hence do not affect the
distribution of $\bar{Q}_i$. X loses min-entropy at most $jm + \log(1/\epsilon)$ with probability at least $1 - \epsilon$ due to
this fixing.

We now note that the random variables $\{\bar{R}_{i,1}^{(h)}, \bar{R}_{i,2}^{(h)} : h \in [j]\}$ are deterministic functions of
$X, \{X^{(h)} : h \in [j]\}$ since we have fixed $\{\bar{Q}_i^{(h)} : h \in [j]\}$. Thus, we can fix $\{\bar{R}_{i,1}^{(h)}, \bar{R}_{i,2}^{(h)} : h \in [j]\}$
and X loses min-entropy at most $2jm + \log(1/\epsilon)$ with probability at least $1 - \epsilon$. Thus it follows
by Lemma [631 that $|\bar{R}_{i,1}, \bar{Q}_i - U_m, \bar{Q}_i| < \epsilon + O(\epsilon_1)$. We fix $\bar{Q}_i$ and Y loses min-entropy at most
$n_q + \log(1/\epsilon)$ using Lemma 3.11. Finally, we note that $\{Q_{i+1}^{(h)} : h \in [j]\}$ is now a deterministic
function of $Y, \{Y^{(h)} : h \in [j]\}$. Thus, we can fix the variables $\{Q_{i+1}^{(h)} : h \in [j]\}$ and Y loses min-entropy
at most $jn_q + \log(1/\epsilon)$ with probability at least $1 - \epsilon$ due to this fixing. Further, $\bar{R}_{i,1}$ is now a
deterministic function of X. It follows that $Q_{i+1}$ is $O(\epsilon_1 + \epsilon)$-close to $U_{n_q}$ since $k_y$ is chosen large
enough. We further fix $\bar{R}_{i,1}$, noting that Ext is a strong extractor, and X loses min-entropy at most
$m + \log(1/\epsilon)$ with probability at least $1 - \epsilon$ due to this fixing.

Case 2: Now suppose b = 0. We fix the random variables $Q_i, \{Q_i^{(h)} : h \in [j]\}$. Conditioned on this

fixing, it follows by Lemma [631 that \Ri^i — Um\ < ei, ei = 0{2^e), with probability at least 1 — e. 
Since Ext is a strong seeded extractor (and $k_y$ is large enough) and $R_{i,1}$ is a deterministic function
of X, it follows that $|\bar{Q}_i, R_{i,1} - U_{n_q}, R_{i,1}| < \epsilon + \epsilon_1$ with probability at least $1 - \epsilon$. We fix $R_{i,1}$, and
observe that $\bar{Q}_i$ is now a deterministic function of Y. We can now fix $\{R_{i,1}^{(h)}, R_{i,2}^{(h)} : h \in [j]\}$ since
$\{R_{i,1}^{(h)}, R_{i,2}^{(h)} : h \in [j]\}$ are deterministic functions of $X, \{X^{(h)} : h \in [j]\}$, and hence do not affect the
distribution of $\bar{Q}_i$. As a result of these fixings, it is clear that $(X, \{X^{(h)} : h \in [j]\})$ is independent of
$(Y, \{Y^{(h)} : h \in [j]\})$. Further, X loses min-entropy of at most $2(j + 1)m + \log(1/\epsilon)$ with probability
at least $1 - \epsilon$, and Y loses min-entropy of at most $2(j + 1)n_q + (j + 1)m + 3\log(1/\epsilon)$ with probability
at least $1 - 3\epsilon$. Note that now $\bar{Q}_i, \{\bar{Q}_i^{(h)} : h \in [j]\}$ are deterministic functions of $Y, \{Y^{(h)} : h \in [j]\}$,
and $\bar{Q}_i$ is $O(\epsilon_1)$-close to $U_{n_q}$. By Lemma [631 it follows that

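$$\bar{R}_{i,2}, \{\bar{R}_{i,1}^{(h)} : h \in [j]\}, \bar{Q}_i, \{\bar{Q}_i^{(h)} : h \in [j]\} \approx_{\epsilon_2} U_m, \{\bar{R}_{i,1}^{(h)} : h \in [j]\}, \bar{Q}_i, \{\bar{Q}_i^{(h)} : h \in [j]\}$$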


where €2 = c{ei+e+e), for some constant c. Thus, we can fix {r[^i : h G [j]}, Qj, : h G [j]} and 

with probability at least $1 - O(\epsilon_2)$, $\bar{R}_{i,2}$ is $O(\epsilon_2)$-close to $U_m$. Note that $\bar{R}_{i,2}$ is now a deterministic
function of X. Further, by Lemma 3.11, Y loses min-entropy at most $(j + 1)n_q + \log(1/\epsilon)$ with
probability at least $1 - \epsilon$ due to this fixing. Since on fixing $\{\bar{Q}_i^{(h)} : h \in [j]\}$, the random
variables $\{\bar{R}_{i,1}^{(h)} : h \in [j]\}$ are deterministic functions of $X, \{X^{(h)} : h \in [j]\}$, the source X loses
min-entropy at most $jm + \log(1/\epsilon)$ with probability at least $1 - \epsilon$ due to this fixing. We now note
that the random variables $\{Q_{i+1}^{(h)} : h \in [j]\}$ are deterministic functions of $Y, \{Y^{(h)} : h \in [j]\}$. Thus,
we fix $\{Q_{i+1}^{(h)} : h \in [j]\}$ and by Lemma 3.11, Y loses min-entropy at most $(j + 1)n_q + \log(1/\epsilon)$ with
probability at least $1 - \epsilon$ due to this fixing. Since Ext extracts from min-entropy $k_1$ (and $k_y$ is large
enough), it follows that the random variable $Q_{i+1}$ is $O(\epsilon_2)$-close to $U_{n_q}$ even after the fixing. Further, we
fix $\bar{R}_{i,2}$ since Ext is a strong seeded extractor, and by Lemma 3.11, X loses min-entropy $m + \log(1/\epsilon)$
with probability at least $1 - \epsilon$ due to this fixing. Further, $Q_{i+1}$ is now a deterministic function of
Y. Thus we can fix the random variables $\{\bar{R}_{i,2}^{(h)} : h \in [j]\}$ since they are deterministic functions
of X and do not affect the distribution of $Q_{i+1}$. X loses min-entropy at most $m + \log(1/\epsilon)$ with
probability at least $1 - \epsilon$ due to this fixing. This completes the proof. □


We now construct a function that is a crucial ingredient in our non-malleable extractor
constructions. (Recall that for any string z we use $z_{\{h\}}$ to denote the symbol in the h'th co-ordinate
of z.)

Lemma 6.9. Let $z, z^{(1)}, \ldots, z^{(t)}$ each be $\ell$ bit strings such that for all $i \in [t]$, $z \neq z^{(i)}$. Let X
be a $(n_w, k_w)$-source and let $X^{(1)}, \ldots, X^{(t)}$ be random variables on $\{0,1\}^{n_w}$ that are arbitrarily
correlated with X. Let $Y, Y^{(1)}, \ldots, Y^{(t)}$ be arbitrarily correlated random variables on $n_y$ bits that are independent of
$(X, X^{(1)}, \ldots, X^{(t)})$. Suppose that Y is a $(n_y, k_y)$-source, $k_y = n_y - \lambda$.
Algorithm 2: nmExt_1(x, y, z)

Input: Bit strings x, y, z of length $n_w, n_y, \ell$ respectively.
Output: A bit string of length $n_q$.

  Let $q_1 = \mathrm{Slice}(y, n_q)$
  for h = 1 to $\ell$ do
      $q_{h+1} = \mathrm{2laExt}(x, y, q_h, z_{\{h\}})$
  end
  Output $q_{\ell+1}$.


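Let $\mathrm{nmExt}_1$ be the function computed by Algorithm 2. Let $\mathrm{nmExt}_1(X, Y, z) = Q_{\ell+1}$, and for
$h \in [t]$, let $\mathrm{nmExt}_1(X^{(h)}, Y^{(h)}, z^{(h)}) = Q_{\ell+1}^{(h)}$. Suppose $k_y \geq \max\{k, k_1\} + 20\ell(tn_q + tm + \log(1/\epsilon))$,
$k_w \geq k + 20\ell(tm + \log(1/\epsilon))$ and $n_q \geq k + 10tm + 2\log(1/\epsilon) + \lambda$. Then, we have

$$Q_{\ell+1}, Q_{\ell+1}^{(1)}, \ldots, Q_{\ell+1}^{(t)} \approx_{\epsilon'} U_{n_q}, Q_{\ell+1}^{(1)}, \ldots, Q_{\ell+1}^{(t)},$$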


where e'^ = 0{{2^ + t)e). 


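Proof. Notation: For any function H, if $V = H(X, Y)$, let $V^{(a)}$ denote the random variable
$H(X^{(a)}, Y^{(a)})$. For $h \in [\ell]$, define the sets

$$\mathrm{Ind}_h = \{i \in [t] : z_{\{h\}} \neq z^{(i)}_{\{h\}}\}, \quad \overline{\mathrm{Ind}}_h = [t] \setminus \mathrm{Ind}_h,$$

$$\mathrm{Ind}_{[h]} = \bigcup_{i=1}^{h} \mathrm{Ind}_i, \quad \overline{\mathrm{Ind}}_{[h]} = [t] \setminus \mathrm{Ind}_{[h]}.$$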

We record a simple claim. 

Claim 6.10. For each $i \in [t]$, there exists $h \in [\ell]$ such that $i \in \mathrm{Ind}_h$.


Proof. Recall that we have fixed $Z, Z^{(1)}, \ldots, Z^{(t)}$ such that $Z \neq Z^{(i)}$ for any $i \in [t]$. Thus it follows
that for each $i \in [t]$, there exists some $h \in [\ell]$ such that $Z_{\{h\}} \neq Z^{(i)}_{\{h\}}$, and hence $i \in \mathrm{Ind}_h$. □


We now prove our main claim, which combined with Lemma 6.8 and a simple inductive argument
proves Lemma 6.9.


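Claim 6.11. For any $h \in \{0, 1, \ldots, \ell\}$, suppose the following holds:

With probability at least $1 - \epsilon_h$ over the fixing of the random variables $\{Q_i : i \in [h]\}$, $\{Q_i^{(j)} : i \in
[h], j \in [t]\}$, $\{R_{i,1}, R_{i,2} : i \in [h]\}$, $\{R_{i,1}^{(j)}, R_{i,2}^{(j)} : i \in [h], j \in [t]\}$, $\{\bar{Q}_i : i \in [h]\}$, $\{\bar{Q}_i^{(j)} : i \in [h], j \in [t]\}$,
$\{\bar{R}_{i,1}, \bar{R}_{i,2} : i \in [h]\}$, $\{\bar{R}_{i,1}^{(j)}, \bar{R}_{i,2}^{(j)} : i \in [h], j \in [t]\}$, $\{Q_{h+1}^{(j)} : j \in \mathrm{Ind}_{[h]}\}$: (a) $Q_{h+1}$ is $\epsilon_h$-close to a
source with min-entropy at least $n_q - \lambda$ and is a deterministic function of Y (b) $\{Q_{h+1}^{(j)} : j \in \overline{\mathrm{Ind}}_{[h]}\}$
is a deterministic function of $Y, \{Y^{(j)} : j \in [t]\}$ (c) The random variables $(X, \{X^{(j)} : j \in [t]\})$
and $(Y, \{Y^{(j)} : j \in [t]\})$ are independent (d) X has min-entropy at least $k_w - 10h(tm + \log(1/\epsilon)) \geq
k + 10(tm + \log(1/\epsilon))$ and Y has min-entropy at least $k_y - 10h(tn_q + tm + \log(1/\epsilon)) \geq \max\{k,
k_1\} + 10(tn_q + tm + \log(1/\epsilon))$.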

Then, the following holds: 


U) 
Let e/i+i = €/i + c2^e for some constant c. With probability at least 1 — Ch+i over the fixing 
of the random variables {Qi : z € [h + : i € [h + l],j G [t]}, Ri ^2 ■ i ^ [h + 1 ]}, 

{Ri^J,Ri^^2 ■ * ^ [^ + l])i G [i]},{Qi : i G [h + l\),{Q^P : i G [h + l\,j G [i]}, : 

i G : i e [/i + l],j G [t]},{Q-+\ : j G Ind[,,+i]}; ("aj Qfe +2 is Ch+i-close to Un^ 

and is a deterministic function of Y (b) ' d ^ is a deterministic function of 

Y,{Y^d) ■ j ^ [i]} The random variables {X,{X^d) ; j ^ [^j}) and {Y,{Y^d) ■ j £ are 
independent (d) X has min-entropy at least — 10(/i + 1) {tm + log (i)) and Y has min-entropy 
at least ky — 10 (/i + 1 ) [tm + log ( 7 )) • 


Proof. We fix the random variables {Qi : i G [h]}, : i G [h],j G [t]}, {Ri i,Ri ^2 '■ i G [/i]}, {Rfl, 

R\^2 ■ * ^ ^ ^ S ^ [h]:j G [t]}, -Ri,2 ^ * G [h]}, {R[^J , R[^^2 ■ ^ ^ 

j ^ W}){Qi+\ • j ^ such that (a), (b), (c), (d) holds (this happens with probability at 

least 1 — e/j. We also fix the random variables [R^^^ : j G Ind[/i]}, noting that they are 


? 0 ') 


deterministic functions of X. Thus X has min-entropy at least $k_w - 10h(tm + \log(1/\epsilon)) - tm - \log(1/\epsilon)$
with probability at least $1 - \epsilon$. Further, Y has min-entropy at least $k_y - 10h(tn_q + tm + \log(1/\epsilon))$.
The claim now follows directly from Lemma 6.8. □


To complete the proof of Lemma 6.9, we now note that the hypothesis of Claim 6.11 is indeed
satisfied when /i = 0. Thus, by I applications of Claim 16.111 it follows that the Q^+i is e^-close 
to Unq, where e{ = 0{2^e + ie). This follows since for all applications of Claim [6TT] except the 
first time, $Q_h$ is $\epsilon_h$-close to uniform, and hence the parameter $\lambda = 0$. This concludes the proof of
Lemma 6.9. □
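The data flow of Algorithms 1 and 2 can be sketched as follows. This is only an illustration of the flip-flop control flow, under loud assumptions: `toy_ext` is a hypothetical placeholder built from SHAKE-256 and is not a strong seeded extractor with any proven guarantee, lengths are measured in bytes rather than bits, and all function names are ours rather than the paper's.

```python
import hashlib

def toy_ext(source: bytes, seed: bytes, out_len: int, tag: bytes) -> bytes:
    """Hypothetical stand-in for a strong seeded extractor (NOT a real one)."""
    h = hashlib.shake_256()
    h.update(tag + len(seed).to_bytes(4, "big") + seed + source)
    return h.digest(out_len)

def la_ext(x: bytes, q: bytes, s1: bytes, m: int):
    """Two steps (u = 2) of alternating extraction between x and q."""
    r1 = toy_ext(x, s1, m, b"w")   # R_1 = Ext_w(X, S_1)
    s2 = toy_ext(q, r1, m, b"q")   # S_2 = Ext_q(Q, R_1)
    r2 = toy_ext(x, s2, m, b"w")   # R_2 = Ext_w(X, S_2)
    return r1, r2

def two_la_ext(x: bytes, y: bytes, q: bytes, b: int, m: int, nq: int) -> bytes:
    """Flip-flop step: b = 0 uses (r1, then r2-bar); b = 1 uses (r2, then r1-bar)."""
    r1, r2 = la_ext(x, q, q[:m], m)
    q_bar = toy_ext(y, r1 if b == 0 else r2, nq, b"e")
    rb1, rb2 = la_ext(x, q_bar, q_bar[:m], m)
    return toy_ext(y, rb2 if b == 0 else rb1, nq, b"e")

def nm_ext1(x: bytes, y: bytes, z_bits, m: int, nq: int) -> bytes:
    """Algorithm 2: iterate the flip-flop step once per bit of the advice z."""
    q = y[:nq]                      # q_1 = Slice(y, n_q)
    for b in z_bits:
        q = two_la_ext(x, y, q, b, m, nq)
    return q
```

For example, `nm_ext1(x, y, [0, 1, 1, 0], m=8, nq=16)` walks through four flip-flop steps. Two advice strings that differ in some position take different branches at that step, which is the structural property the proof of Lemma 6.9 exploits.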


6.4 An Explicit Seedless (2, t)-Non-Malleable Extractor Construction

We are now ready to present our construction. We first set up the various ingredients developed so
far with appropriate parameters.

Subroutines and Parameters 

1. Let γ be a small enough constant and C a large one. Let t = .

2. Let ni = fii = IO 7 . Let IP : {0, Ij^^ x {0,1}”! —{0,1}”^, n 2 = be the strong 
two-source extractor from Theorem 13.171 

3. Let C be an explicit [^,n, ^]-binary linear error correcting code with encoder E : {0,1}"" —>■ 
{0, l}a . Such explicit codes are known, for example from the work of Alon et al. |ABN'*~92 . 

4. Let Samp : { 0 , l}"'^ —>• be the sampler from Corollary 16.31 with parameters (5samp = 7 ^ 

and z^samp = fii- Let the number of samples tsamp = Thus, ^2 < fii- 

5. Let £ = 2(n^i -|- n^^). Thus £ < . 

6 . We set up the parameters for the components used by 21aExt (computed by Algorithm 1) as 
follows. 

(a) Let na = with = IOO 7 and = 5 O 7 . 
Let Extq ; {0,1}”^ x {0,1}""^ —{0,1}""* be the strong seeded linear extractor from 
Theorem 13.141 set to extract from min-entropy kq = ^ with error e = jq = ^. 

Thus, by Theorem 13.141 we have that the seed length dq = O ~ = 

714. 

Let Ext^ : {0,1}” x {0,1}”"' —>■ {0, l}"'^ be the strong linear seeded extractor from 
Theorem 13.141 set to extract from min-entropy ^ with error e = 

(b) Let laExt : {0,1}"' x {0,1}”® —> {0, l}2"-4 be the look ahead extractor used by 21aExt 
(recall that the parameters in the alternating extraction protocol are set as m = n^, 
u = 2 where u is the number of steps in the protocol, m is the length of each random 
variable that is communicated between the players, and Extg, Ext^ are the strong seeded 
extractors used in the protocol.). 

(c) Let Ext : {0,1}"' x {0,1}"'"' {0,1}”^ be the linear strong seeded extractor from Theorem 

13.141 set to extract from min-entropy ^ with seed length 774 and error 

7. Let nniExti be the function computed by Algorithm 2, which uses the function 21aExt set up 
as above. 


Algorithm 3: nmExt(x,y) 

Input: Bit strings x, y, each of length n. 

Output: A bit string of length 77 , 4 . 

1 Let xi = Shce(3:,TT-i), 7/1 = Slice(7 /,tt-i). Compute v = IP(3:,7/). 

2 Compute T = Samp(7;) C [^]. 

3 Let z = xi o X 2 o yi o 7/2 where X 2 = (£'(x)){t}, 7/2 = iE{y))[T}- 

4 Output nmExti(x, 7 /, z). 


We now state our main theorem. 

Theorem 6.12. Let nmExt be the function computed by Algorithm 3. Then nmExt is a seedless 
{2,t)-non-malleable extractor with error 

We establish the following two lemmas, from which the above theorem follows directly.

Lemma 6.13. nmExt : {0,1}"' x {0,1}" —)■ {0,1}""' satisfies the following property Vn-' If X,Y 
are independent {n,n — n"^)-sources and Ai = {fi, gi),... ,At = {ft,gt) o.fe arbitrary 2-split-state 
tampering functions, such that for each is [t], at least one of fi,gi has no fixed points, then the 
following holds: 


|nmExt(A, T), nmExt(Ai(A, Y )),..., nmExt(At(A, T)) — 
Un 4 ,nmEx.t{Ai{X,Y)),... , nmExt(A(A, y))| < e, 


where e = 2 . 

Lemma 6.14. Suppose nmExt : {0,1}" x {0,1}" —)■ {0,1}""^, satisfies property Vn (from Lemma 
16.1311 . Then, nmExt is a seedless {2,t)-non-malleable extractor with error (2“"^ -|-e)2^*. 

Notation: For any function H, if V = H{X, Y), let 17^*^ denote the random variable H{Ai{X, 

y))- 
Proof of Lemma 6.13. We begin by proving the following claim.

Claim 6.15. With probability at least 1 — Z ^ for each i £ [t]. 

Proof. Pick an arbitrary i £ [t]. Without loss of generality, suppose fi has no fixed points. If 

^ XP or Yi / then Z / Z(*). Now suppose Xi = and Yi = We fix Xi, and 

note that since IP is a strong extractor (Theorem 13.17p . V is 2“^("'i)-close to C /„2 after this fixing 
(with probability at least 1 — Also note that V = 

Since fi has no fixed points, it follows that since E is an encoder of a code with relative distance 
distance A(i?(X), £;(X«)) > Let D = {j £ [^] : E{X){^} + Thus |Z1| > 

Using Corollary 16.31 it follows that with probability at least 1 — \D n Samp(U)| > 1, and 

thus X 2 7 ^ X^'^ (since Samp(U) = Samp(U(*))). This proves the claim. □ 
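The argument in Claim 6.15 (distinct inputs give codewords that differ in many positions, so a sample of positions is likely to hit a difference) can be illustrated with a toy computation. The sketch below assumes the Hadamard code as a stand-in, because its distance is easy to verify by hand; the construction itself uses the code from [ABN+92] and the sampler of Corollary 16.31, neither of which is implemented here. All names are ours.

```python
from itertools import product

def hadamard_encode(msg):
    """Stand-in code: the bit <msg, a> mod 2 for every a in {0,1}^n."""
    n = len(msg)
    return [sum(mi & ai for mi, ai in zip(msg, a)) % 2
            for a in product((0, 1), repeat=n)]

def differing_positions(x, x_tampered):
    """The set D of codeword positions where E(x) and E(x') disagree."""
    cx, ct = hadamard_encode(x), hadamard_encode(x_tampered)
    return {j for j, (u, v) in enumerate(zip(cx, ct)) if u != v}

def advice_differs(x, x_tampered, sample):
    """True if the sampled codeword symbols separate x from x'."""
    d = differing_positions(x, x_tampered)
    return any(j in d for j in sample)
```

With 4-bit messages, two distinct messages disagree on exactly 8 of the 16 codeword positions, so any sample that covers more than half of the positions must hit the set D. In the claim, the sample is the same for the original and the tampered input, which is why a single hit forces the two advice strings to differ.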

We fix Z, Zft^ ) ■ ■ ■ j Z*-*^ such that Z 7 ^ Z*-*^ for any i £ [t] (from the lemma above, this occurs 
with probability 1 — 2“”' ). We note that by the Lemma [6.15l and Lemma lS.lll each of the sources 

X and Y still has min-entropy at least n — rY — {t + 1)£ — > n — with probability at 

least 1 — 2 “”^^^°. 

Lemma 6.13 now follows directly from Lemma 6.9 by noting that the following hold by our
choice of parameters:

• n — > § + 20 (n^i + n^'^){rY /^+ n^‘^) 

• nhs > |(20tn^'‘ + 


This concludes the proof. □ 

Proof of Lemma 6.14. Let $A_1 = (f_1, g_1), \ldots, A_t = (f_t, g_t)$ be arbitrary 2-split-state adversaries. We
partition $\{0,1\}^n$ in two different ways based on the fixed points of the tampering functions.

For any $R \subseteq [t]$, define

$$W^{(R)} = \{x \in \{0,1\}^n : f_i(x) = x \text{ if } i \in R, \text{ and } f_i(x) \neq x \text{ if } i \in [t] \setminus R\}.$$

Similarly, for any $S \subseteq [t]$, define

$$V^{(S)} = \{y \in \{0,1\}^n : g_i(y) = y \text{ if } i \in S, \text{ and } g_i(y) \neq y \text{ if } i \in [t] \setminus S\}.$$

Thus the sets $W^{(R)}, R \subseteq [t]$ define a partition of $\{0,1\}^n$. Similarly, $V^{(S)}, S \subseteq [t]$ define a partition
of $\{0,1\}^n$. For $R, S \subseteq [t]$, let $X^{(R)}$ be a random variable uniform on $W^{(R)}$ and $Y^{(S)}$ be a random
variable uniform on $V^{(S)}$.


Let Um be uniform on {0,1}”'‘ and independent of X^, Y^, for all R,SE [t]. 
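The partition by fixed-point pattern can be made concrete on a small domain. The following sketch, with hypothetical helper names of our own, groups every $x \in \{0,1\}^n$ by the set $R$ of indices $i$ for which $f_i(x) = x$; the non-empty groups are exactly the sets of the partition, and the same routine applies to the functions $g_i$.

```python
from itertools import product

def fixed_point_partition(fs, n):
    """Map each pattern R = {i : f_i(x) = x} to the list of x with that pattern.

    fs is a list of tampering functions on n-bit tuples; indices start at 1
    to match the convention i in [t].
    """
    parts = {}
    for x in product((0, 1), repeat=n):
        pattern = frozenset(i for i, f in enumerate(fs, start=1) if f(x) == x)
        parts.setdefault(pattern, []).append(x)
    return parts
```

For instance, take n = 3 with $f_1$ flipping the first bit (no fixed points), $f_2$ the identity, and $f_3$ zeroing the last bit. Then every x lands in the part for R = {2, 3} or R = {2}, depending on its last bit, and the two parts together cover all 8 strings.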


Define


f,9 ^ 


714 ! 


{R,S) 


...,Z, 


{R,S)i 


where we define the random variable 


^{ES) 


nmExt(/,(X(«)), 5 i(y(^))) if z G [t] \ (RDS) 
same* if i £ Rn S 


Define the distribution:

$$D_{f,g} = \sum_{R, S \subseteq [t]} \alpha_{R,S}\, D^{(R,S)}_{f,g},$$

where $\alpha_{R,S} = \frac{|W^{(R)}||V^{(S)}|}{2^{2n}}$.

We first prove the following claim. 

Claim 6.16. Let 


^R,S = aR,s|nmExt(X(^\y(^)),nmExt(/i(X(^)),5ri(y(^))),..., 

nmExt(/t(X(«)),5i(y(®))) - 

J iQ 

Then, for every R,SC [t], Ar^s < 2“"'^ + e. 

Proof. If it follows that q;_r ,5 < 2“"'^, and hence the claim follows. Thus, assume 

that > n — n'^. Using a similar argument, we can assume that > n — n'^. 

Let Rn S = [f] \ (i? n 5) = {fi,..., ij}. It follows that for any c G i? n S', at least one the 
following is true: (1) fc has no fixed points on lU*'^) (2) gc has no fixed points on Thus, 

invoking Lemma 16.131 we have 

|nmExt(x(^), nmExt(/*, (T^^^)),... , nmExt(/*^ 

- Un ,, nmExt (/i, ), g,^iY^^^)),... , nmExt (/,, (X^^)), g^^ (T ^^^)) | < e 

The claim now follows by observing that for each c G i? D S', fc and gc are the identity functions 
on the sets and respectively. □ 


Let X, y be independent and uniformly random on {0,1}"". Thus, we have 

|nmExt(X, y), nmExt(.4.i(X, y)), ..., nmExt(.4.t(X, y)) 
-C/„„copyW(D^^., Un,)\ = Y. < 2^\e + 2-""). 

iJ,SC[t] 

Thus nmExt is a (2,t)-non-malleable extractor with error $(\epsilon + 2^{-n^{\gamma}})2^{2t}$.


□ 


7 An Explicit Seeded Non-Malleable Extractor at Polylogarithmic
Min-entropy

Subroutines and Parameters 


1. Let 7 be a small enough constant and C a large one. Let t,k,d be parameters such that 
t < 

2. Let ni = log(^). Let Ext* : {0,1}” x {0, l}”i —)• {0, l}”i be the strong seeded extractor 
from Theorem 13.151 set to extract from min-entropy 2ni and error 

3. Let C be an explicit [^, d, ^]-binary linear error correcting code with encoder E : {0,1}'^ —)■ {0, 
l}a. Such explicit codes are known, for example from the work of Alon et al. [ABN'*~9^ . 
4. Let Extsamp : {0, x {0, ^ {0,1}”^ be the strong seeded extractor from Theorem 3.16

set to extract from min-entropy ^ with error ^ and output length n 2 , such that N 2 D 1 = 
where N 2 = ^nd Di = 2*^1. Let {0,1}'^^ = {si,... , S£)j}. Define Samp : {0, —>■ 

as: Samp( 3 :) = (Ext( 3 :, si) o si,..., Ext( 3 :, sdx) o sdi)- By Theorem 13.161 we have Di = cini, 
for some constant Ci- 

5. Let ^ = ni + Di = (ci + l)ni. 

6 . We set up the parameters for the components used by 21aExt (computed by Algorithm 1) as 
follows. 

(a) Let res = cst^, 714 = 10^, for some large enough constant C 3 . 

Let Extq : {0,1}"^ x {0,1}”"‘ —)■ {0,1}”"‘ be the strong seeded extractor from Theorem 
13.151 set to extract from min-entropy kq = ^ with error e = 

Let Ext^ : {0,1}” x {0, l}”'^ —> {0,1}"''‘ be the strong seeded extractor from Theorem 
13.151 set to extract from min-entropy | with error e = 

(b) Let laExt : {0,1}” x {0, l}” 3+"'4 |q^ ]^j. 2 n 4 ahead extractor used by 21aExt. 

Recall that the parameters in the alternating extraction protocol are set as m = n^, 
u = 2 where u is the number of steps in the protocol, rre is the length of each random 
variable that is communicated between the players, and Extg, Ext^o are the strong seeded 
extractors used in the protocol. 

(c) Let Ext : {0, l}*^ x {0,1}""^ —>■ {0,1}"'® be the strong seeded extractor from Theorem 13.151 
set to extract from min-entropy | with seed length re 4 and error 

7. Let nmExti be the function computed by Algorithm 2, which uses the function 21aExt set up 
as above. 

8 . Let res = Let Exti : {0,1}"' x {0, l}”'^ {0,1}"'® be the strong seeded extractor from 

Theorem 13.151 set to extract from min-entropy | with seed length re 4 , error 


Algorithm 4: snmExt(x,y) 

Input: Bit strings x,y, of length n,d respectively. 
Output: A bit string of length re 4 . 

1 = Slice(y, rei). Compute u = Exts(x, yi). 

2 Compute T = Samp(7;) C [^]. 

3 Let z = yi o y 2 where y 2 = {E{y)){T}- 

4 Output Exti(a;, nmExti (x, y, 2 :)). 


We now state our main theorem. 

Theorem 7.1. Let snmExt : {0,1}” x {0,1}'^ —>■ {0,1}”® he the function computed by Algorithm 
4. Then snmExt satisfies the following property: For any e > 0, /c > C'log^^'^ t < and 
d > Ct^log^ if X is a {n,k)-source, and Y is an independent and uniform distribution on 
{0,1}'^, and Ai... ,At are arbitrary tampering functions, such that for each i G [t], Ai has no fixed 
points, then the following holds: 

I snmExt (A, T), snmExt (A, Ai{Y)),..., snmExt (A, At{Y)),Y — 

1/^5, snmExt (A, Ai(y)),..., snmExt(A, At{Y)), Y\ < 0(e), 
Notation: For any function H, if $V = H(X, Y)$, let $V^{(i)}$ denote the random variable $H(X, A_i(Y))$.

Proof. We first prove the following claim. 

Claim 7.2. With probability at least $1 - \epsilon$, $Z \neq Z^{(i)}$ for each $i \in [t]$.

Proof. Pick an arbitrary i G [t]. If Yi 7 ^ Y^^\ then we have Z 7 ^ Z^^'l . Now suppose Yi = Yj^\ We 
fix li, and note that since Ext^ is a strong extractor fTheorem l3.17l) . B is 2“^("'i)-close to C/„j. 

Since Ai has no fixed points, it follows that since E is an encoder of a code with relative distance 
distance A(E(y),E(y«)) > Let V = {j e [|] : Thus \V\ > 

Using Theorem 16.21 it follows that with probability at least 1 — e, |P n Samp(U)| > 1, and thus 
Y 2 A ^ 2 *^ (since Samp(U) = Samp(U(*))). The claim now follows by a simple union bound. □ 

We fix $Z, Z^{(1)}, \ldots, Z^{(t)}$ such that $Z \neq Z^{(i)}$ for any $i \in [t]$ (from the claim above, this occurs
with probability $1 - \epsilon$). We note that by Claim 7.2 and Lemma 3.11 the source X has min-entropy
at least $k - 2n_1$ and the source Y has min-entropy at least $d - 2\ell$ with probability at least
$1 - \epsilon$.

Theorem 7.1 now follows directly from Lemma 6.9 by noting that the following hold by our
choice of parameters:

• f > 20i{t{n3 + n4) + log(7)) 

• k - 2ni > ^ + 20 ^(tn 4 + log(i)) 

• ns — 2ni > |( 10 tre 4 + 21og(i)) 

This concludes the proof. □ 

8 Efficient Encoding and Decoding Algorithms for One-Many Non- 
Malleable Codes 

In this section, we construct efficient algorithms for almost uniformly sampling from the pre-image
of any output of a modified version of the (2, t)-non-malleable extractor constructed in Section 6.
Combining this with Theorem 5.1 and Theorem 6.12 gives us efficient constructions of one-many
non-malleable codes in the 2-split state model, with tampering degree t = relative rate 

n^T) ffi and error . 

A major part of this section is devoted to modifying the components used in the construction of nmExt
(Algorithm 3) so that the overall extractor is much simpler to analyze as a function, which
enables us to develop efficient algorithms for sampling from the pre-image. We present the modified
extractor construction in Section 8.2. However, we first need to solve a simpler problem.

8.1 A New Linear Seeded Extractor 

A crucial sub-problem that we have to solve is almost uniformly sampling from the pre-image of a 
linear seeded extractor in polynomial time. Towards this, we recall a well known property of linear 
seeded extractors. 
Lemma 8.1 ([Rao09]). Let $\mathrm{Ext} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$ be a linear seeded extractor for
min-entropy k with error $\epsilon < \frac{1}{2}$. Let X be an affine (n, k)-source. Then

$$\Pr_{u \sim U_d}\left[\,|\mathrm{Ext}(X, u) - U_m| > 0\,\right] \leq \epsilon.$$

□

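Definition 8.2. For any seeded extractor $\mathrm{Ext} : \{0,1\}^n \times \{0,1\}^d \to \{0,1\}^m$, any $s \in \{0,1\}^d$ and
$r \in \{0,1\}^m$, we define:

• $\mathrm{Ext}(\cdot, s) : \{0,1\}^n \to \{0,1\}^m$ to be the map $\mathrm{Ext}(\cdot, s)(x) = \mathrm{Ext}(x, s)$.

• $\mathrm{Ext}^{-1}(r)$ to be the set $\{(x, y) \in \{0,1\}^n \times \{0,1\}^d : \mathrm{Ext}(x, y) = r\}$.

• $\mathrm{Ext}^{-1}(\cdot, s)(r)$ to be the set $\{x : \mathrm{Ext}(x, s) = r\}$.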

We now present a natural way of sampling from pre-images of linear seeded extractors. 

Claim 8.3. Let Ext : {0,1}” x {0,1}'^ —>■ {0,1}™ be a linear seeded extractor for min-entropy k 
with error e < For any r G {0,1}™, consider the following efficient sampling procedure S 

which on input r does the following: (a) sample $s \sim U_d$, (b) sample x uniformly from the subspace
$\mathrm{Ext}(\cdot, s)^{-1}(r)$, (c) output (x, s). Let $\mathcal{D}_r$ be the distribution uniform on $\mathrm{Ext}^{-1}(r)$, and let $S(r)$
denote the distribution produced by S on input r.

Then, 

\S{r)-Vr\ < 


Proof. Define the sets: 

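$$\mathrm{Good} = \{s \in \{0,1\}^d : \mathrm{rank}(\mathrm{Ext}(\cdot, s)) = m\}, \quad \mathrm{Bad} = \{0,1\}^d \setminus \mathrm{Good}.$$

It follows by Lemma 8.1 that $|\mathrm{Good}| \geq (1 - \epsilon)2^d$. Further, for any $s \in \mathrm{Good}$, $|\mathrm{Ext}(\cdot, s)^{-1}(r)| = 2^{n-m}$.
Thus, we have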

^ |Ext-n-,s)(r)| > 2'='+”-™-^ 

s£Good 

Further, for any s' G Bad, |Ext~^(-, s^)(r)| < 2”, and hence 

^ |Ext-n-,s')(^)l < 

s'£Bad 

Thus I Ext“^(-, s')(r)| < 2“°-^™|Ext“^(r)|. It now follows that 

\S{r)-Vr\ < 


□ 
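The sampling procedure S of Claim 8.3 can be sketched concretely once a seed fixes a linear map over GF(2). The code below is a minimal sketch under stated assumptions: each map is given as a list of n-bit integer row masks, `rows_for_seed` is a hypothetical callable standing in for the extractor family, and returning `None` for an empty fibre is a choice of this sketch that the claim does not discuss.

```python
import random

def parity(v: int) -> int:
    return bin(v).count("1") & 1

def apply_map(rows, x):
    """Evaluate the linear map with the given row masks at the n-bit integer x."""
    return [parity(row & x) for row in rows]

def solve_gf2(rows, rhs):
    """Reduced row echelon form of [M | rhs] over GF(2); None if inconsistent."""
    pivots = {}  # pivot column -> (row mask, rhs bit)
    for mask, bit in zip(rows, rhs):
        for col, (pm, pb) in pivots.items():
            if (mask >> col) & 1:
                mask ^= pm
                bit ^= pb
        if mask == 0:
            if bit:
                return None
            continue
        col = mask.bit_length() - 1
        for c, (pm, pb) in list(pivots.items()):
            if (pm >> col) & 1:
                pivots[c] = (pm ^ mask, pb ^ bit)
        pivots[col] = (mask, bit)
    return pivots

def sample_fibre(rows, rhs, n, rng):
    """Uniform sample from {x : M x = rhs}, or None if that set is empty."""
    pivots = solve_gf2(rows, rhs)
    if pivots is None:
        return None
    x = 0
    for col, (_, bit) in pivots.items():
        x |= bit << col                 # particular solution, free bits zero
    for f in range(n):
        if f in pivots or not rng.getrandbits(1):
            continue
        x ^= 1 << f                     # add the kernel vector of free column f
        for col, (pm, _) in pivots.items():
            if (pm >> f) & 1:
                x ^= 1 << col
    return x

def procedure_s(rows_for_seed, d, n, r, rng):
    """(a) sample a seed, (b) sample x uniformly from its fibre, (c) output (x, s)."""
    s = rng.getrandbits(d)
    return sample_fibre(rows_for_seed(s), r, n, rng), s
```

Each free coordinate contributes an independent fair coin, so the output of `sample_fibre` is uniform over the affine solution set whenever that set is non-empty.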


We note that $\epsilon$ must be $o(2^{-m})$ for the above sampling procedure to work with low enough
error. However, this would require a seed length of $d = O(m^2)$ (by Theorem 3.14). For each step
of the alternating extraction protocol the seed length then goes down by a quadratic factor, which
is insufficient for our application.

To get past this difficulty, we construct a new strong linear seeded extractor for high min- 
entropy sources with the seed length close to the output length with the property that the size 
of the pre-image of any output is the same for any fixing of the seed. Algorithm 5 provides this 
construction. 

Parameters and Subroutines: 


1. Let (5 > 0 be any constant. Let d = . Let d = di + d 2 , where di = S > lOJi. Let 

m = d/2. 

2. Let Samp : {0, [n]*, t = ^ 2 ) be an (/i, 0 , 7 ) averaging sampler with distinct samples, 

such that Ai = ^ ^ = iog(i/r) ^ r = 0.05. 

3. Let IP : {0,1}'^^ x {0,1}'^^ {0, 1}2 be the strong 2 -source extractor from Theorem 13.171 


Algorithm 5: iExt(x, s)

Input: Bit strings x, s of length n, d respectively.

Output: A bit string of length m.

  Let $s_1 = \mathrm{Slice}(s, d_1)$. Let $s_2$ be the remaining $d_2$ bits of s.
  Let $T = \mathrm{Samp}(s_1) \subset [n]$. Let $x_1 = x_{\{T\}}$.
  Output $\mathrm{IP}(x_1, s_2)$.


Informally the construction of iExt is as follows. Given a uniform seed S, we use a slice $S_1$
of S to sample co-ordinates from the weak source X, and then apply a strong 2-source extractor
(based on the inner product function) to the source $X_1$ (which is the projection of X to the sampled
co-ordinates) and the remaining bits $S_2$ of S to extract $d/2$ uniform bits.

The correctness of this procedure relies on the fact that by pseudorandomly sampling
co-ordinates of X and projecting X to these co-ordinates, the min-entropy rate is roughly
preserved for most choices of the uniform seed [Zuc97, Vad04, Li12a]. Thus, we can fix $S_1$, and the
strong two-source extractor IP now receives two independent inputs $S_2$ and $X_1$ with almost full
min-entropy. Thus, the output is close to uniform. Further, we show that the number of linear
constraints on the source X is the same for any fixing of the seed. This allows us to show that the size
of the pre-image of any particular output is the same for any fixing of the seed. We now formally
prove these ideas.

We need the following theorem proved by Vadhan [Vad04].

Theorem 8.4 1 |Vad04] ). Let 1 > d > 3r > 0. Let Samp : {0, 1}'' —)• [re]* be an (//, 0 , 7 ) averaging 
sampler with distinct samples, such that p, = and 9 = ■ If X is a (re, dre) source, then 

the random variable {Ur, X^samp{Ur)}) (t + 2“**(™))-cZose to {Ur, W) where for every a G {0,1}^ 
, the random variable W\Ur = a is a {t, (5 — 3T)t)-source. 

Lemma 8.5. Let iExt be the function computed by Algorithm 5. If X is a (n, 0.9n) source and S
is an independent uniform seed on $\{0,1\}^d$, then the following holds:

\iExt{X, S), S -Um,S\ < 

Further, for any $r \in \{0,1\}^m$ and any $s \in \{0,1\}^d$, $|\mathrm{iExt}(\cdot, s)^{-1}(r)| = 2^{n-m}$.

Proof. Using Theorem 18.41 it follows that Xi is 2“"'^^^'-close to a source with min-entropy at least 
0.8re for any hxing of Si. Further, we note that after fixing ^i, S 2 and Xi are independent sources. 
We now think of Xi,S 2 as sources in {0, by appending a 1 to both the sources, so that 

S 2 7 ^ 0, and then apply the inner product map. This results in an entropy loss of only 1. It now 
follows by Theorem 13 .1 71 that 

|iExt(X,5),5- 
It is easy to see that for any fixing of the seed S = s, $\mathrm{iExt}(\cdot, s)$ is a linear map. Let X be
uniform on n bits. We note that for any fixing of $S_2 = s_2$, $X_1$ lies in a subspace of dimension
$d_2 - m$ over $\mathbb{F}_2$. Further, the bits outside T have no restrictions placed on them. Thus the size of
$\mathrm{iExt}(\cdot, s)^{-1}(r)$ is exactly $2^{d_2 - m + n - d_2} = 2^{n-m}$. This completes the proof of the lemma. □

Based on the above lemma, we construct an efficient procedure for sampling uniformly from the 
pre-image of the function iExt. 

Claim 8.6. Let iExt : {0,1}” x {0, ^ {0,1}™ be the function computed by Algorithm 5. Then 

there exists a polynomial time algorithm Sampj^ that takes as input r G {0,1}”*, and samples from 
a distribution that is uniform on iExt~^(r). 

Proof. It follows by Lemma 8.5 that for any fixing of the seed s, the size of the set $\mathrm{iExt}(\cdot, s)^{-1}(r)$ is
exactly $2^{n-m}$. Thus we can use the following strategy: (a) Sample $s \sim U_d$ (b) Sample x uniformly
at random from the subspace $\mathrm{iExt}(\cdot, s)^{-1}(r)$ (c) Output (x, s). It follows that each element in $\mathrm{iExt}^{-1}(r)$
is picked with probability exactly $2^{-d} \cdot 2^{-(n-m)} = 2^{-(d+n-m)}$. Thus the output of our sampling procedure is indeed
uniform on $\mathrm{iExt}^{-1}(r)$. □
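The role of equal fibre sizes in Claim 8.6 can be checked exhaustively on toy families. The sketch below, with names of our own and maps again given as integer row masks over GF(2), computes the exact statistical distance between "uniform seed, then uniform point in the fibre" and the uniform distribution on the whole pre-image. Treating an empty fibre as a separate failure outcome is an assumption of this sketch.

```python
from fractions import Fraction

def fibre(rows, r, n):
    """All n-bit integers x with M x = r, by brute force."""
    return [x for x in range(2 ** n)
            if [bin(row & x).count("1") & 1 for row in rows] == r]

def sampler_distance(family, r, n):
    """Exact statistical distance of the seed-then-fibre sampler from uniform.

    family holds one linear map per seed; seeds are chosen uniformly.
    """
    fibres = [fibre(rows, r, n) for rows in family]
    total = sum(len(f) for f in fibres)
    seeds = len(family)
    dist = Fraction(0)
    for f in fibres:
        if not f:
            dist += Fraction(1, seeds)   # this seed's mass lands on "failure"
            continue
        dist += len(f) * abs(Fraction(1, seeds * len(f)) - Fraction(1, total))
    return dist / 2
```

When every seed gives a fibre of the same size, the distance is exactly zero, which is the situation Lemma 8.5 guarantees for iExt. A single rank-deficient seed is enough to make it non-zero, which is the obstacle Claim 8.3 has to pay for with a very small error parameter.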

8.2 A Modified Construction of the Seedless (2, t)-Non-Malleable Extractor 

We first describe the high level ideas involved in modifying the construction of nmExt (Algorithm 
3), before presenting the formal construction. 

• We use the linear seeded extractor iExt (Algorithm 5) for any seeded extractor used in the 
construction of nmExt. 

• Next we divide the sources X and Y into blocks of size respectively for a small constant
$\delta$. Since each of X and Y has almost full min-entropy, we now have two block sources, where
each block has almost full min-entropy conditioned on the previous blocks. The idea is to use
new blocks of X and Y for each round of alternating extraction in nmExt.

To implement this, however, we need some care. Recall that the alternating extraction protocol
is run for two rounds between either $X$ and $Q_h$, or $X$ and $\overline{Q}_h$, in the function 2laExt. The
idea now is to run these two rounds of alternating extraction by dividing $Q_h$ into two blocks, and
using two new partitions of $X$ (each round being run by using a block from either $X$ or $Q_h$).
Now to generate these $Q_h$'s, we use $O(t)$ blocks of $Y$, and for each block apply the strong
seeded extractor iExt, using as seed the output of the alternating extraction from the previous
step, and finally concatenate the outputs. This works because these $O(t)$ blocks form a block
source, and using the same seed to extract from all the blocks is a well known technique for
extracting from block sources.

• By appropriate setting of the lengths of the seeds in the alternating extraction, we ensure that
each block of X and Y still has min-entropy rate $1 - o(1)$ even after fixing all the intermediate
seeds, the random variables $Q_h$, $\overline{Q}_h$ and their tampered versions. This can be ensured since
each of these variables is of length at most for some small constant $\delta_1$, and the number
of adversaries is also

• The above modification is almost sufficient for us to successfully sample from the pre-image of
any output. One final modification is to use a specific error correcting code (the Reed-Solomon
code over a field of size $n + 1$ with characteristic 2) in the initial step of the construction,
when we encode the sources and sample bits from it. We give some intuition as to why this is
necessary. Since we are using linear seeded extractors in the alternating extraction, by fixing
the seeds we impose linear restrictions on the blocks of X and Y. Now, if we fix the output
of the initial sampling step (the random variable Z in Algorithm 3), we are imposing more
linear constraints on the blocks (assuming we are using a linear code). Now, it is not clear
whether the constraints imposed by the linear seeded extractor are independent of the constraints
imposed by Z, and thus for different fixings of Z and the seeds the size of the pre-image
of any output of the non-malleable extractor may be different.

To get past this difficulty, our idea is to first partition X and Y into slightly smaller blocks
(which does not affect the correctness of the extractor) such that at least half of the blocks are
unused by the alternating extraction steps. Now, we show that by using the Reed-Solomon
code over $\mathbb{F} = \mathbb{F}_{2^{\log(n+1)}}$ to encode the sources, fixing Z imposes linear constraints involving
the variables from these unused blocks, and we show that this is sufficient to argue that these
constraints are linearly independent of the restrictions imposed by the alternating extraction part.
We provide complete details of the sampling algorithms in Section 8.3.

We now proceed to present the extractor construction. Recall that if $Z_a, Z_{a+1}, \ldots, Z_b$ are
random variables, we use $Z_{[a,b]}$ to denote the random variable $Z_a, \ldots, Z_b$.

Subroutines and Parameters (used by Algorithm 6, Algorithm 7, Algorithm 8) 

1. Let 7 be a small enough constant and C a large one. Let t = . 

2. Let ni = (5\ = IO7. Let n 2 = n — ni. Let IPi : {0,1}”^ x {0, 1}^^ —)• {0,1}”®, ^^3 = 70 

be the strong two-source extractor from Theorem 3.17.

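3. Let $\mathbb{F}$ be the finite field $\mathbb{F}_{2^{\log(n+1)}}$. Let $n_4 = n_2/\log(n+1)$, and let $\mathrm{RS} : \mathbb{F}^{n_4} \to \mathbb{F}^n$ be the Reed-Solomon
code encoding $n_4$ symbols of $\mathbb{F}$ to $n$ symbols in $\mathbb{F}$ (we overload the use of RS, using it to
denote both the code and the encoder). Thus RS is an $[n, n_4, n - n_4 + 1]_n$ error correcting
code.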

4. Let $\mathrm{Samp} : \{0,1\}^{n_3} \to [n]^{n_5}$ be a $(\mu, 2^{-n^{\Omega(1)}})$ averaging sampler with distinct samples. By
using the strong seeded extractor from Theorem 3.15 we can set $n_5 = n^{\beta_2}$, $\beta_2 < \beta_1/2$.

5. Let i = 2(ni -|- log n) < . Thus £ < 

6. Let $n_6 = 50Ct\ell$. Let $\mathrm{IP}_2 : \{0,1\}^{n_6} \times \{0,1\}^{n_6} \to \{0,1\}^{2n_q}$, $n_q = 10Ct\ell$, be the strong two-source
extractor from Theorem 3.17.

7. Let nj = n — ni — uq. Let Ux = Let Uy = Thus nx,ny > 

8. Let $d_1 = 80\ell$.

9. Let $\mathrm{iExt}_1 : \{0,1\}^{n_x} \times \{0,1\}^{d_1} \to \{0,1\}^{d_2}$, $d_2 = 40\ell$, be the extractor computed by Algorithm 5.

10. Let $\mathrm{iExt}_2 : \{0,1\}^{n_q} \times \{0,1\}^{d_2} \to \{0,1\}^{d_3}$, $d_3 = 20\ell$, be the extractor computed by Algorithm 5.

11. Let $\mathrm{iExt}_3 : \{0,1\}^{n_x} \times \{0,1\}^{d_3} \to \{0,1\}^{d_4}$, $d_4 = 10\ell$, be the extractor computed by Algorithm 5.

12. Let $\mathrm{iExt}_4 : \{0,1\}^{n_y} \times \{0,1\}^{d_4} \to \{0,1\}^{d_5}$, $d_5 = 5\ell$, be the extractor computed by Algorithm 5.

13. Let $\mathrm{Ext} : \{0,1\}^{4Ctn_y} \times \{0,1\}^{d_4} \to \{0,1\}^{4Ctd_5}$ be defined in the following way. Let $v_1, \ldots, v_{4Ct}$ be
strings, each of length $n_y$. Define $\mathrm{Ext}(v_1 \circ \ldots \circ v_{4Ct}, s) = \mathrm{iExt}_4(v_1, s) \circ \ldots \circ \mathrm{iExt}_4(v_{4Ct}, s)$.
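
The block-wise definition of Ext in item 13 can be illustrated with a short Python sketch. The per-block map `toy_iext4` below is a hypothetical linear map chosen only for illustration (it is not Algorithm 5 and carries no extraction guarantee); what the sketch shows is the structure: the same seed is applied to every block and the outputs are concatenated, so Ext stays linear in the source for any fixed seed.

```python
def mask_for(seed, i, n):
    """Hypothetical seed-dependent n-bit mask; any fixed choice keeps the map linear in the block."""
    return ((seed + 1) * (2 * i + 1) * 0x9E3779B1) & ((1 << n) - 1)

def toy_iext4(block, seed, n, m):
    """Toy stand-in for iExt_4: output bit i is the GF(2) inner product of block with mask i."""
    return tuple(bin(block & mask_for(seed, i, n)).count("1") & 1 for i in range(m))

def block_ext(blocks, seed, n, m):
    """Ext(v_1 o ... o v_k, s) = iExt_4(v_1, s) o ... o iExt_4(v_k, s): one seed, all blocks."""
    out = ()
    for v in blocks:
        out += toy_iext4(v, seed, n, m)
    return out
```

With k blocks and m output bits per block the result has k·m bits, mirroring how $4Ct$ outputs of length $d_5$ are concatenated in the definition above.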


Algorithm 6: inmExt(x, y)

Input: Bit strings x, y, each of length n.

Output: A bit string of length m.

1 Let $x_1 = \mathrm{Slice}(x, n_1)$, $y_1 = \mathrm{Slice}(y, n_1)$. Compute $v = \mathrm{IP}_1(x_1, y_1)$.

2 Let $x_2, y_2$ be $n_2$ length strings formed by cutting $x_1, y_1$ from $x, y$ respectively.

3 Let $T = \mathrm{Samp}(v) \subset [n]$.

4 Interpret $x_2, y_2$ as elements in $\mathbb{F}^{n_4}$.

5 Let $\overline{x}_2 = \mathrm{RS}(x_2)$, $\overline{y}_2 = \mathrm{RS}(y_2)$.

6 Let $\overline{x}_1 = (\overline{x}_2)_{\{T\}}$, $\overline{y}_1 = (\overline{y}_2)_{\{T\}}$, interpreting $\overline{x}_2, \overline{y}_2 \in \mathbb{F}^n$.

7 Let $z = x_1 \circ \overline{x}_1 \circ y_1 \circ \overline{y}_1$, where $z$ is interpreted as a binary string.

8 Interpret $x_2, y_2$ as binary strings.

9 Output $\mathrm{inmExt}_1(x_2, y_2, z)$.


Algorithm 7: $\mathrm{inmExt}_1(x_2, y_2, z)$

1 Let $x_3 = \mathrm{Slice}(x_2, n_6)$, $y_3 = \mathrm{Slice}(y_2, n_6)$. Let $w, v$ be the remaining parts of $x_2, y_2$
respectively.

2 Let $\mathrm{IP}_2(x_3, y_3) = (q_{1,1}, q_{1,2})$, where each of $q_{1,1}, q_{1,2}$ is of length $n_q$.

3 Let $w_1, \ldots, w_{8\ell}$ be an equal sized partition of the string $w$ into $8\ell$ strings.

4 Let $v_1, \ldots, v_{16Ct\ell}$ be an equal sized partition of the string $v$ into $16Ct\ell$ strings.

5 for $h = 1$ to $\ell$ do

6   $(q_{h+1,1}, q_{h+1,2}) = \mathrm{2ilaExt}(v_{[8C(h-1)t+1,\,8Cht]}, w_{[4h-3,\,4h]}, q_{h,1}, q_{h,2}, h, z_{\{h\}})$

7 end

8 Output $(q_{\ell+1,1}, q_{\ell+1,2})$.


Theorem 8.7. Let inmExt be the function computed by Algorithm 7. Then inmExt is a seedless
$(2, t)$-non-malleable extractor with error

The proof of the above theorem is essentially the same as the proof provided in Section 6, and
we do not repeat it. The correctness of inmExt follows directly from the proof of Theorem 6.12,
the correctness of the extractor iExt (Lemma 8.5), the fact that by our choice of parameters
each block of X and Y still has min-entropy rate at least 0.9 after appropriate conditioning on the
intermediate random variables and their tampered versions, and the fact that using RS in place
of a binary error correcting code does not affect the correctness of the procedure.

8.3 Efficiently Sampling from the Pre-Image of inmExt 

Since the construction of the non-malleable extractor inmExt (Algorithm 6, Algorithm 7, Algorithm
8) is composed of various sub-parts and sub-functions, we first argue about the invertibility of these
parts and then show a way to compose these sampling procedures to sample almost uniformly from
the pre-image of inmExt. We refer to all the variables, sub-routines and notations introduced
in these algorithms while developing the sampling procedures. Unless we state otherwise, by a
subspace we mean a subspace over $\mathbb{F}_2$.

Algorithm 8: $\mathrm{2ilaExt}(v_{[8C(h-1)t+1,\,8Cht]}, w_{[4h-3,\,4h]}, q_{h,1}, q_{h,2}, h, b)$

1 Let $s_{h,1} = \mathrm{Slice}(q_{h,1}, d_1)$, $r_{h,1} = \mathrm{iExt}_1(w_{4h-3}, s_{h,1})$, $s_{h,2} = \mathrm{iExt}_2(q_{h,2}, r_{h,1})$,
$r_{h,2} = \mathrm{iExt}_3(w_{4h-2}, s_{h,2})$.

2 if $b = 0$ then

3   Let $r_h = \mathrm{Slice}(r_{h,1}, d_4)$.

4 else

5   Let $r_h = r_{h,2}$.

6 end

7 Let $\mathrm{Ext}(v_{[8C(h-1)t+1,\,8C(h-1)t+4Ct]}, r_h) = (\overline{q}_{h,1}, \overline{q}_{h,2})$, where both $\overline{q}_{h,1}, \overline{q}_{h,2}$ are of length $n_q$.

8 Let $\overline{s}_{h,1} = \mathrm{Slice}(\overline{q}_{h,1}, d_1)$, $\overline{r}_{h,1} = \mathrm{iExt}_1(w_{4h-1}, \overline{s}_{h,1})$, $\overline{s}_{h,2} = \mathrm{iExt}_2(\overline{q}_{h,2}, \overline{r}_{h,1})$,
$\overline{r}_{h,2} = \mathrm{iExt}_3(w_{4h}, \overline{s}_{h,2})$.

9 if $b = 0$ then

10   Let $\overline{r}_h = \overline{r}_{h,2}$.

11 else

12   Let $\overline{r}_h = \mathrm{Slice}(\overline{r}_{h,1}, d_4)$.

13 end

14 Let $\mathrm{Ext}(v_{[8C(h-1)t+4Ct+1,\,8Cht]}, \overline{r}_h) = (q_{h+1,1}, q_{h+1,2})$, where both $q_{h+1,1}, q_{h+1,2}$ are of length
$n_q$.

15 Output $(q_{h+1,1}, q_{h+1,2})$.

We first show how to sample uniformly from the pre-image of 2ilaExt (Algorithm 8), since it is
a crucial sub-part of inmExt. We have the following claim.

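Claim 8.8. For any fixing of the variables $\{s_{1,i}, r_{1,i}, \overline{s}_{1,i}, \overline{r}_{1,i} : i \in \{1,2\}\}$, and any $b \in \{0,1\}$, define
the set:

$$\mathrm{2ilaExt}^{-1}(q_{2,1}, q_{2,2}) = \{(x_3, y_3, v_{[1,4Ct]}, w_{[1,4]}) \in \{0,1\}^{2n_6 + 4Ctn_y + 4n_x} : \mathrm{2ilaExt}(v_{[1,4Ct]}, w_{[1,4]}, q_{1,1}, q_{1,2}, b) = (q_{2,1}, q_{2,2})\}$$

There exists an efficient algorithm $\mathrm{Samp}_2$ that takes as input $q_{2,1}, q_{2,2}$, $\{s_{1,i}, r_{1,i}, \overline{s}_{1,i}, \overline{r}_{1,i} : i \in \{1,2\}\}$,
and samples uniformly from $\mathrm{2ilaExt}^{-1}(q_{2,1}, q_{2,2})$.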

Further, the set $\mathrm{2ilaExt}^{-1}(q_{2,1}, q_{2,2})$ is a subspace over $\mathbb{F}_2$ of dimension di, and its size does not
depend on the inputs to $\mathrm{Samp}_2$.

Proof. The general idea is that by fixing the seeds in the alternating extraction, each block of $w$
takes values independent of the fixing of the other blocks of $w$ and the $q_{i,j}$'s, and similarly the $q_{i,j}$'s
take values independent of each other and the blocks of $w$. We now formally prove this intuition.
Since $s_{1,1}$ is a slice of $q_{1,1}$, it follows that $q_{1,1}$ is restricted to the subspace of size $2^{n_q - d_1}$. Since
$r_{1,1} = \mathrm{iExt}_1(w_1, s_{1,1})$, it follows that $w_1$ is restricted to the set $\mathrm{iExt}_1(\cdot, s_{1,1})^{-1}(r_{1,1})$. Further, it
follows by Lemma 8.5 that this is a subspace of size $2^{n_x - d_2}$. Similar arguments show that $q_{1,2}$ is
restricted to a subspace of size $2^{n_q - d_3}$ and $w_2$ is restricted to a subspace of size
$2^{n_x - d_4}$. Further, we note that these variables have no correlation.


By repeating this argument for the next two rounds of alternating extraction, it follows that
$\overline{q}_{1,1}$ is restricted to a subspace of size $2^{n_q - d_1}$, $w_3$ is restricted to a subspace of size $2^{n_x - d_2}$, $\overline{q}_{1,2}$ is
restricted to a subspace of size $2^{n_q - d_3}$ and $w_4$ is restricted to a subspace of size $2^{n_x - d_4}$.

Further, since $(q_{2,1}, q_{2,2}) = \mathrm{Ext}(v_{[4Ct+1,8Ct]}, \overline{r}_1) = \mathrm{iExt}_4(v_{4Ct+1}, \overline{r}_1) \circ \ldots \circ \mathrm{iExt}_4(v_{8Ct}, \overline{r}_1)$, it follows
by an application of Lemma 8.5 that for any fixed $q_{2,1}$, $v_{[4Ct+1,6Ct]}$ is restricted to a subspace of size
$2^{2Ct(n_y - d_5)}$. A similar argument shows that for any fixed $q_{2,2}$, $v_{[6Ct+1,8Ct]}$ is restricted to a subspace
of size $2^{2Ct(n_y - d_5)}$.

Finally, since IPi(x3,y3) = (91,1,91,2), it follows that for any fixed 2:3,91,1,91,2, the variable 1/3 
lies in a subspace of size 2”®“^°®!^"'^) since by fixing the variables 2:3,91,1,91,2, we are restricting 1/3 

to a subspace of dimension (toifc - 1) over the field F2,og(2„,). 

It is clear from the arguments that we did not use any specific values of the inputs given to the
algorithm $\mathrm{Samp}_2$ (including the value of the bit $b$) to argue about the size of $\mathrm{2ilaExt}^{-1}(q_{2,1}, q_{2,2})$.
Also note that each of $x_3, y_3, v_{[1,4Ct]}, w_{[1,4]}$ is restricted to some subspace. Since $\mathrm{2ilaExt}^{-1}(q_{2,1}, q_{2,2})$
is the cartesian product of these subspaces, it follows that it is a subspace over $\mathbb{F}_2$. Thus the claim
now follows since we can efficiently sample from a given subspace. □

Using arguments very similar to the above claim, we obtain the following result. 

Claim 8.9. For any $h \in \{2, \ldots, \ell\}$, any fixing of the variables $\{s_{h,i}, r_{h,i}, \overline{s}_{h,i}, \overline{r}_{h,i} : i \in \{1,2\}\}$, and
any $b \in \{0,1\}$, define the set:

$$\mathrm{2ilaExt}^{-1}(q_{h+1,1}, q_{h+1,2}) = \{(v_{[8C(h-1)t-4Ct+1,\,8C(h-1)t+4Ct]}, w_{[4h-3,\,4h]}) \in \{0,1\}^{8Ctn_y + 4n_x} : \mathrm{2ilaExt}(v_{[8C(h-1)t+1,\,8Cht]}, w_{[4h-3,\,4h]}, q_{h,1}, q_{h,2}, b) = (q_{h+1,1}, q_{h+1,2})\}.$$

There exists an efficient algorithm $\mathrm{Samp}_{h+1}$ that takes as input $q_{h+1,1}, q_{h+1,2}, b$, $\{s_{h,i}, r_{h,i}, \overline{s}_{h,i}, \overline{r}_{h,i} : i \in \{1,2\}\}$,
and samples uniformly from $\mathrm{2ilaExt}^{-1}(q_{h+1,1}, q_{h+1,2})$.

Further, $\mathrm{2ilaExt}^{-1}(q_{h+1,1}, q_{h+1,2})$ is a subspace over $\mathbb{F}_2$, and its size does not depend on the
inputs to $\mathrm{Samp}_{h+1}$. □

We now show a way of efficiently sampling from the pre-image of the function $\mathrm{inmExt}_1$ (Algorithm 7).

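Claim 8.10. For any string $a \in \{0,1\}^{\ell}$, and any fixing of the variables $\{s_{h,i}, r_{h,i}, \overline{s}_{h,i}, \overline{r}_{h,i} : h \in [\ell],
i \in \{1,2\}\}$, define the set:

$$\mathrm{inmExt}_1^{-1}(q_{\ell+1,1}, q_{\ell+1,2}) = \{(x_2, y_2) \in \{0,1\}^{2n_2} : \mathrm{inmExt}_1(x_2, y_2, a) = (q_{\ell+1,1}, q_{\ell+1,2})\}.$$

There exists an efficient algorithm $\mathrm{Samp}_{nm_1}$ that takes as input $\{s_{h,i}, r_{h,i}, \overline{s}_{h,i}, \overline{r}_{h,i} : h \in [\ell], i \in \{1,2\}\}$,
$a$, $q_{\ell+1,1}, q_{\ell+1,2}$, and samples uniformly from $\mathrm{inmExt}_1^{-1}(q_{\ell+1,1}, q_{\ell+1,2})$.

Further, $\mathrm{inmExt}_1^{-1}(q_{\ell+1,1}, q_{\ell+1,2})$ is a subspace over $\mathbb{F}_2$, and its size does not depend on the
inputs to $\mathrm{Samp}_{nm_1}$.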

Proof. We observe that once we fix all the seeds $\{s_{h,i}, r_{h,i}, \overline{s}_{h,i}, \overline{r}_{h,i} : h \in [\ell], i \in \{1,2\}\}$, for different
$h \in [\ell]$, the blocks $(v_{[8C(h-1)t-4Ct+1,\,8C(h-1)t+4Ct]}, w_{[4h-3,\,4h]})$ can be sampled independently. Thus,
by using the algorithms $\{\mathrm{Samp}_{h+1} : h \in [\ell]\}$ from Claim 8.8 and Claim 8.9 we sample the variables
$x_3, y_3, w_{[1,4]}, v_{[1,4Ct]}, \{v_{[8C(h-1)t-4Ct+1,\,8C(h-1)t+4Ct]}, w_{[4h-3,\,4h]} : h \in [\ell]\}$.

Finally, since $\mathrm{Ext}(v_{[8C(\ell-1)t+4Ct+1,\,8C\ell t]}, \overline{r}_{\ell}) = (q_{\ell+1,1}, q_{\ell+1,2})$, it follows by the arguments in
Claim 8.8 that the block $v_{[8C(\ell-1)t+4Ct+1,\,8C\ell t]}$ is restricted to a subspace of size $2^{4Ct(n_y - d_5)}$.
Thus, we can efficiently sample this block as well.

Further, the variable $w_{[4\ell+1,\,8\ell]}$ is unused by the algorithm $\mathrm{inmExt}_1$, and hence takes all values
in $\{0,1\}^{4\ell n_x}$. Similarly, the variable $v_{[8Ct\ell+1,\,16Ct\ell]}$ is unused by the algorithm $\mathrm{inmExt}_1$ and hence
takes all values in $\{0,1\}^{8Ct\ell n_y}$. Thus, we sample these variables as uniform strings of the appropriate
length.

Since $x_2, y_2$ are concatenations of the various blocks sampled above, we can indeed sample
efficiently from a distribution uniform on $\{(x_2, y_2) \in \{0,1\}^{2n_2} : \mathrm{inmExt}_1(x_2, y_2, a) = (q_{\ell+1,1}, q_{\ell+1,2})\}$.

Further, since by Claim 8.8 and Claim 8.9 the size of the pre-image of each of the blocks generated
does not depend on the inputs (and each pre-image is also a subspace), it follows that $\mathrm{inmExt}_1^{-1}(q_{\ell+1,1}, q_{\ell+1,2})$ is a
subspace, and its size does not depend on the inputs to $\mathrm{Samp}_{nm_1}$. □

We now proceed to construct an algorithm to uniformly sample from the pre-image of any 
output of the function inmExt (Algorithm 6), which will yield the required efficient encoder for the 
resulting one-many non-malleable codes. 

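Claim 8.11. For any fixing of the variable $z = x_1 \circ \overline{x}_1 \circ y_1 \circ \overline{y}_1$ and the variables $\{s_{h,i}, r_{h,i}, \overline{s}_{h,i},
\overline{r}_{h,i} : h \in [\ell], i \in \{1,2\}\}$, define the set:

$$\mathrm{inmExt}^{-1}(q_{\ell+1,1}, q_{\ell+1,2}) = \{(x, y) \in \{0,1\}^{2n} : \mathrm{inmExt}(x, y) = (q_{\ell+1,1}, q_{\ell+1,2})\}.$$

There exists an efficient algorithm $\mathrm{Samp}_{nm}$ that takes as input $\{s_{h,i}, r_{h,i}, \overline{s}_{h,i}, \overline{r}_{h,i} : h \in [\ell], i \in \{1,2\}\}$,
$z$, $q_{\ell+1,1}, q_{\ell+1,2}$, and samples uniformly from $\mathrm{inmExt}^{-1}(q_{\ell+1,1}, q_{\ell+1,2})$.

Further, $\mathrm{inmExt}^{-1}(q_{\ell+1,1}, q_{\ell+1,2})$ is a subspace over $\mathbb{F}_2$, and its size does not depend on the
inputs to $\mathrm{Samp}_{nm}$.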

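Proof. We fix the variables $x_1$ and $y_1$. Let $T = \mathrm{Samp}(v) = \{t_1, \ldots, t_{n_5}\}$. We now think of $x_2$ as an
element of $\mathbb{F}^{n_4}$, where $\mathbb{F} = \mathbb{F}_{2^{\log(n+1)}}$. Let $x_2 = (x_{2,1}, \ldots, x_{2,n_4})$, where each $x_{2,i}$ is in $\mathbb{F}$. Recall that the
$n_4 \times n$ generator matrix $G$ of the code RS is the following:

$$G = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ \alpha_1 & \alpha_2 & \cdots & \alpha_n \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_1^{n_4-1} & \alpha_2^{n_4-1} & \cdots & \alpha_n^{n_4-1} \end{pmatrix}$$

where $\alpha_1, \ldots, \alpha_n$ are distinct non-zero field elements of $\mathbb{F}$.

Let

$$G_T = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ \alpha_{t_1} & \alpha_{t_2} & \cdots & \alpha_{t_{n_5}} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_{t_1}^{n_4-1} & \alpha_{t_2}^{n_4-1} & \cdots & \alpha_{t_{n_5}}^{n_4-1} \end{pmatrix}$$

Since $\overline{x}_1 = \mathrm{RS}(x_2)_{\{T\}}$, we have the following identity:

$$(x_{2,1} \;\cdots\; x_{2,n_4})\, G_T = \overline{x}_1 \qquad (1)$$

Thus, for any fixing of $\overline{x}_1$, the variable $x_2$ is restricted to a subspace of dimension $(n_4 - n_5)$ over
the field $\mathbb{F}$.

Now, let $j \in [n_4]$ be such that $(x_{2,1}, \ldots, x_{2,j})$ is the string $(x_3, w_{[1,4\ell]})$, and $(x_{2,j+1}, \ldots, x_{2,n_4})$ is
the string $w_{[4\ell+1,\,8\ell]}$. Clearly, $(n_4 - j) \log n = 4\ell n_x$, and thus by our choice of parameters it follows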

tnat 3 — riA logn ~ 2 + log{n+l) < 3 < ^34 775. 

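We further note that since any $n_5 \times n_5$ sub-matrix of $G_T$ has full rank (since it is a Vandermonde
matrix), it follows by the rank-nullity theorem that the $(n_4 - j) \times n_5$ sub-matrix of $G_T$ formed by its
last $n_4 - j$ rows has null space of dimension exactly $n_4 - j - n_5$. Thus for any $\lambda \in \mathbb{F}^{n_5}$, the equation

$$(x_{2,j+1} \;\cdots\; x_{2,n_4}) \begin{pmatrix} \alpha_{t_1}^{j} & \alpha_{t_2}^{j} & \cdots & \alpha_{t_{n_5}}^{j} \\ \vdots & \vdots & \ddots & \vdots \\ \alpha_{t_1}^{n_4-1} & \alpha_{t_2}^{n_4-1} & \cdots & \alpha_{t_{n_5}}^{n_4-1} \end{pmatrix} = \lambda \qquad (2)$$

has exactly $|\mathbb{F}|^{n_4 - j - n_5}$ solutions.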

Thus, for any fixing of the variables $x_{2,1}, \ldots, x_{2,j}$, equation (1) has exactly $|\mathbb{F}|^{n_4 - j - n_5}$ solutions. In
other words, for any fixing of $x_3, w_{[1,4\ell]}, \overline{x}_1$, the variable $w_{[4\ell+1,\,8\ell]}$ is restricted to a subspace, and the
size of the subspace does not depend on the fixing of $x_3, w_{[1,4\ell]}, \overline{x}_1$. Using a similar argument, we
can show that for any fixing of $y_3, v_{[1,8Ct\ell]}, \overline{y}_1$, the variable $v_{[8Ct\ell+1,\,16Ct\ell]}$ is restricted to a subspace,
and the size of the subspace does not depend on the fixing of $y_3, v_{[1,8Ct\ell]}, \overline{y}_1$.

Now consider any fixing of the variables $\{s_{h,i}, r_{h,i}, \overline{s}_{h,i}, \overline{r}_{h,i} : h \in [\ell], i \in \{1,2\}\}$, $z$. As proved in
Claim 8.10, we can efficiently sample the variables $x_3, w_{[1,4\ell]}, y_3, v_{[1,8Ct\ell]}$. By the above argument,
the variables $w_{[4\ell+1,\,8\ell]}$ and $v_{[8Ct\ell+1,\,16Ct\ell]}$ lie in a subspace, and hence we can efficiently
sample these variables as well. Thus we have an efficient procedure $\mathrm{Samp}_{nm}$ for uniformly sampling
$(x, y)$ from the set $\mathrm{inmExt}^{-1}(q_{\ell+1,1}, q_{\ell+1,2})$.

It also follows by Claim 8.10 that the total size of the pre-image of the variables $x_3, w_{[1,4\ell]},
y_3, v_{[1,8Ct\ell]}$ does not depend on $z$ or the variables $\{s_{h,i}, r_{h,i}, \overline{s}_{h,i}, \overline{r}_{h,i} : h \in [\ell], i \in \{1,2\}\}$. Further,
for any fixing of $x_3, y_3, v_{[1,8Ct\ell]}, z$, as argued above, the variables $w_{[4\ell+1,\,8\ell]}$ and $v_{[8Ct\ell+1,\,16Ct\ell]}$
now lie in a subspace, whose size does not depend on the fixed variables. Thus, overall the size of
the total pre-image of $x, y$ does not depend on the inputs to $\mathrm{Samp}_{nm}$. □
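
The counting step behind equations (1) and (2) can be checked by brute force on a tiny instance. The sketch below is an illustration only, with parameters chosen for the example rather than taken from the construction: it works over GF(8) (built here as $\mathbb{F}_2[x]/(x^3+x+1)$), uses the seven non-zero field elements as evaluation points, and takes $n_4 = 4$, $j = 1$ and a set $T$ of size $n_5 = 2$. For every fixed prefix and every target value on $T$ it counts how many suffixes are consistent.

```python
from collections import Counter
from itertools import product

def gf8_mul(a, b):
    """Multiply in GF(2^3) = GF(2)[x]/(x^3 + x + 1); elements are the integers 0..7."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:       # reduce modulo x^3 + x + 1
            a ^= 0b1011
    return r

def gf8_pow(a, e):
    """a raised to the non-negative integer power e in GF(8)."""
    r = 1
    for _ in range(e):
        r = gf8_mul(r, a)
    return r

ALPHAS = list(range(1, 8))   # the 7 distinct non-zero elements of GF(8)

def rs_symbol(msg, alpha):
    """One coordinate of msg * G: the sum over i of msg[i] * alpha^i."""
    acc = 0
    for i, c in enumerate(msg):
        acc ^= gf8_mul(c, gf8_pow(alpha, i))
    return acc

def suffix_solution_counts(n4, j, T):
    """For each (prefix, target on T), count the suffixes whose codeword restricted
    to the coordinates in T equals the target."""
    counts = Counter()
    for prefix in product(range(8), repeat=j):
        for suffix in product(range(8), repeat=n4 - j):
            msg = prefix + suffix
            target = tuple(rs_symbol(msg, ALPHAS[t]) for t in T)
            counts[(prefix, target)] += 1
    return counts
```

On this instance `suffix_solution_counts(4, 1, (0, 3))` assigns the same count, $8 = |\mathbb{F}|^{n_4 - j - n_5}$, to every prefix and every target, so the number of consistent suffixes does not depend on what was fixed.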

We now state the main result of this section. 

Theorem 8.12. There exists an efficient procedure that given an input $(q_{\ell+1,1}, q_{\ell+1,2}) \in \{0,1\}^{n_q} \times
\{0,1\}^{n_q}$, samples uniformly from the set $\{(x, y) : \mathrm{inmExt}(x, y) = (q_{\ell+1,1}, q_{\ell+1,2})\}$.

Proof. We use the following simple strategy.

1. Uniformly sample the variables $z$, $\{s_{h,i}, r_{h,i}, \overline{s}_{h,i}, \overline{r}_{h,i} : h \in [\ell], i \in \{1,2\}\}$.

2. Use the variables sampled in Step (1) as input to the algorithm $\mathrm{Samp}_{nm}$ to sample $(x, y)$.

The correctness of this procedure follows directly from Claim 8.11, since it was proved that for any
fixing of the variables of Step 1, the size of the pre-image of inmExt is the same. □
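
The reason equal pre-image sizes matter in this two-step strategy can be seen on a toy example. The Python sketch below is an illustration under assumptions: `f` plays the role of the extractor, `g` the role of the side information fixed in Step 1, and both are hypothetical functions on 4-bit integers supplied by the caller. It computes the exact output distribution of "pick a fixing uniformly, then pick a uniform pre-image consistent with it".

```python
from collections import defaultdict
from fractions import Fraction

def two_step_distribution(domain, f, g, r):
    """Exact distribution of: choose a fixing a uniformly (among the fixings with a
    non-empty fibre), then choose x uniformly from {x : f(x) = r and g(x) = a}."""
    fibers = defaultdict(list)
    for x in domain:
        if f(x) == r:
            fibers[g(x)].append(x)
    dist = {}
    for fiber in fibers.values():
        for x in fiber:
            dist[x] = Fraction(1, len(fibers)) * Fraction(1, len(fiber))
    return dist

def toy_f(x):
    """Hypothetical 'extractor': XOR of bit 0 and bit 2 of x."""
    return (x ^ (x >> 2)) & 1

# Fixing the top two bits gives four fibres of size 2 each, so the result is uniform.
equal = two_step_distribution(range(16), toy_f, lambda x: x >> 2, 0)

# Fixing only whether x >= 12 gives fibres of sizes 2 and 6, so the result is skewed.
unequal = two_step_distribution(range(16), toy_f, lambda x: int(x >= 12), 0)
```

In the first case every element of the pre-image has probability 1/8; in the second the probabilities are 1/4 and 1/12. This is why Claim 8.11 insists that the pre-image size does not depend on the inputs to the sampler.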